The ATLAS detector at CERN records the collisions delivered by the LHC at a rate of 40 million collisions per second. To cope with this enormous volume of data, sophisticated systems for online triggering and offline distributed data management have been developed for the LHC era. The ideas behind these concepts will be briefly discussed, together with the rationale for designing the accelerator and the experiments in this way. Before the data can be used in physics analysis, it must undergo several steps, such as reconstruction, calibration, and data-quality checks. All of this falls within the area known in ATLAS as data preparation, and in this talk I will give a non-technical overview of these activities, which are essential before the data can be used to make potential discoveries.