
Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.databases.ms-sqlserver      Notorious Rube Goldberg contraption      19,505 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 18,747 of 19,505   
   Bob Barrows to migurus   
   Re: table design for massive GPS data   
   14 Sep 12 07:33:54   
   
   From: reb01501@NOSPAMyahoo.com   
      
   migurus wrote:   
   > Windows Server/SQL Server 2008, we need to store GPS readings from   
   > roughly 1000 vehicles, ~ 300 readings per vehicle per day. The data
   > arrives in real time. Records are never updated after insert. That
   > gives us roughly 120 million records per year. We need to keep last 5   
   > years of records.   
   >   
   > The usage of stored data is to query vehicle positions and event   
   > attributes within entire day by given vehicle and date. Geospatial   
   > features and functions are not really needed for the reporting.   
   >   
   > I'd like to discuss table design for these requirements.   
   >   
   > Here is a preliminary record content:   
   > - vehicle id   
   > - gps reading X and Y   
   > - gps reading time   
   > - one-letter event attribute   
   > - event id (nullable)   
   >   
   > First idea is to keep data normalized in one table, one record per   
   > GPS reading. It looks very simple to implement, but I doubt the   
   > server will be happy with this high number of small records.   
      
   Why not? Narrow and long tables are perfect for relational databases   
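   
   A quick back-of-envelope, using the column sizes from your idea1 def
   below (so treat the numbers as rough; your 120 million presumably
   includes some headroom):
   
   -- int(4) + real(4) + real(4) + datetime(8) + char(1) + int(4)
   --   = 25 bytes of data per row; call it ~35 bytes with row overhead
   select cast(1000 * 300 * 365 as bigint) as rows_per_year         -- ~110 million
        , cast(1000 * 300 * 365 as bigint) * 35 / 1048576 as mb_per_year  -- ~3.6 GB
   
   A few GB per year is nothing for SQL Server to choke on.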
      
   > If we were to do this, what would be a good candidate for primary
   > key here?
      
   What combination of columns identifies a row? Vehicle_ID and GPS_TIME? If   
   so, there's your clustered primary key (wtp?)   
   What is that event id and why is it nullable?   
      
      
   > Just for illustration I show table def for idea# 1   
   > create table idea1 (   
   > VEHICLE_ID int not null   
   > , GPS_X real not null   
   > , GPS_Y real not null   
   > , GPS_TIME datetime not null   
   > , EVENT_LETTER char(1) not null   
   > , EVENT_ID int null   
   > );   
   > I don't show indexes as I am not clear what to use.   
   > The typical query would look like this:   
   > select   
   > GPS_X   
   > , GPS_Y   
   > , GPS_TIME   
   > , EVENT_LETTER   
   > , EVENT_ID   
   > from   
   > idea1   
   > where   
   > VEHICLE_ID=@id   
   > and GPS_TIME between @from and @to   
      
   The proposed clustered primary key above would suit this query perfectly   
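   
   Concretely, something like this (a sketch only -- the constraint name
   and key column order are my choice, and it assumes a vehicle never
   reports two readings with the same timestamp):
   
   create table idea1 (
     VEHICLE_ID int not null
   , GPS_X real not null
   , GPS_Y real not null
   , GPS_TIME datetime not null
   , EVENT_LETTER char(1) not null
   , EVENT_ID int null
   , constraint PK_idea1
       primary key clustered (VEHICLE_ID, GPS_TIME)
   );
   
   With that, your WHERE VEHICLE_ID=@id AND GPS_TIME BETWEEN @from AND
   @to is a single clustered index seek over a contiguous range -- no
   extra indexes needed for this query.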
      
   >   
   >   
   > Second idea is to keep a small table where records are inserted, in
   > the structure shown above, as they come in, and then on a daily
   > basis aggregate them and store one record per vehicle per day in
   > another table, which will be used for queries. By doing this we
   > shrink the number of records from 120 million to ~ 350 thousand per
   > year. This looks much more manageable. The cons here are the
   > additional effort to come up with a way of storing and retrieving
   > GPS readings as a blob.
   >   
   Nah, this data isn't really suitable for aggregation, IMO.   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca