home bbs files messages ]

Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.databases.ms-sqlserver      Notorious Rube Goldberg contraption      19,505 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 18,749 of 19,505   
   migurus to Bob Barrows   
   Re: table design for massive GPS data   
   14 Sep 12 10:13:10   
   
   From: migurus@yahoo.com   
      
   On Friday, September 14, 2012 4:35:11 AM UTC-7, Bob Barrows wrote:   
   > migurus wrote:   
   >   
   > > Windows Server/SQL Server 2008, we need to store GPS readings from   
   >   
   > > roughly 1000 vehicles, ~ 300 readings per vehicle per day. The data   
   >   
   > > comes in in real-time. Records are never updated after insert. That   
   >   
   > > gives us roughly 120 million records per year. We need to keep last 5   
   >   
   > > years of records.   
   >   
   > >   
   >   
   > > The usage of stored data is to query vehicle positions and event   
   >   
   > > attributes within entire day by given vehicle and date. Geospatial   
   >   
   > > features and functions are not really needed for the reporting.   
   >   
   > >   
   >   
   > > I'd like to discuss table design for these requirements.   
   >   
   > >   
   >   
   > > Here is a preliminary record content:   
   >   
   > > - vehicle id   
   >   
   > > - gps reading X and Y   
   >   
   > > - gps reading time   
   >   
   > > - one-letter event attribute   
   >   
   > > - event id (nullable)   
   >   
   > >   
   >   
   > > First idea is to keep data normalized in one table, one record per   
   >   
   > > GPS reading. It looks very simple to implement, but I doubt the   
   >   
   > > server will be happy with this high number of small records.   
   >   
   >   
   >   
   > Why not? Narrow and long tables are perfect for relational databases   
   >   
   >   
      
   Thanks, I did not know that, as I have limited exposure in db field.   
      
   >   
   > > I we   
   >   
   > > were to do this, what would be a good candidate for primary key here?   
   >   
   >   
   >   
   > What combination of columns identifies a row? Vehicle_ID and GPS_TIME? If   
   >   
   > so, there's your clustered primary key (wtp?)   
   >   
   > What is that event id and why is it nullable?   
      
   Some events do not require id, this will not be any foreign key or anything   
      
   >   
   >   
   >   
   >   
   >   
   > > Just for illustration I show table def for idea# 1   
   >   
   > > create table idea1 (   
   >   
   > > VEHICLE_ID int not null   
   >   
   > > , GPS_X real not null   
   >   
   > > , GPS_Y real not null   
   >   
   > > , GPS_TIME datetime not null   
   >   
   > > , EVENT_LETTER char not null   
   >   
   > > , EVENT_ID int null   
   >   
   > > );   
   >   
   > > I don't show indexes as I am not clear what to use.   
   >   
   > > The typical query would look like this:   
   >   
   > > select   
   >   
   > > GPS_X   
   >   
   > > , GPS_Y   
   >   
   > > , GPS_TIME   
   >   
   > > , EVENT_LETTER   
   >   
   > > , EVENT_ID   
   >   
   > > from   
   >   
   > > idea1   
   >   
   > > where   
   >   
   > > VEHICLE_ID=@id   
   >   
   > > and GPS_TIME between @from and @to   
   >   
   >   
   >   
   > The proposed clustered primary key above would suit this query perfectly   
   >   
   >   
   >   
   > >   
   >   
   > >   
   >   
   > > Second idea is to keep a small table where records are inserted as   
   >   
   > > they come in the structure shown above and then on the daily basis   
   >   
   > > aggregate it and store as one record per vehicle in the another   
   >   
   > > table, which will be used for queries. By doing this we shrink number   
   >   
   > > of records from 120 million to be ~ 350 thousands per year. This   
   >   
   > > looks much more manageable. The cons here are additional effort to   
   >   
   > > come up with the way of storing and retrieving gps readings as a   
   >   
   > > blob.   
   >   
   > >   
   >   
   > Nah, this data isn't really suitable for aggregation, IMO.   
      
   Thanks for your input.   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]


(c) 1994,  bbs@darkrealms.ca