Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.databases.ms-sqlserver    |    Notorious Rube Goldberg contraption    |    19,505 messages    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
|    Message 18,749 of 19,505    |
|    migurus to Bob Barrows    |
|    Re: table design for massive GPS data    |
|    14 Sep 12 10:13:10    |
      From: migurus@yahoo.com              On Friday, September 14, 2012 4:35:11 AM UTC-7, Bob Barrows wrote:       > migurus wrote:       >       > > Windows Server/SQL Server 2008, we need to store GPS readings from       >       > > roughly 1000 vehicles, ~ 300 readings per vehicle per day. The data       >       > > comes in in real-time. Records are never updated after insert. That       >       > > gives us roughly 120 million records per year. We need to keep last 5       >       > > years of records.       >       > >       >       > > The usage of stored data is to query vehicle positions and event       >       > > attributes within entire day by given vehicle and date. Geospatial       >       > > features and functions are not really needed for the reporting.       >       > >       >       > > I'd like to discuss table design for these requirements.       >       > >       >       > > Here is a preliminary record content:       >       > > - vehicle id       >       > > - gps reading X and Y       >       > > - gps reading time       >       > > - one-letter event attribute       >       > > - event id (nullable)       >       > >       >       > > First idea is to keep data normalized in one table, one record per       >       > > GPS reading. It looks very simple to implement, but I doubt the       >       > > server will be happy with this high number of small records.       >       >       >       > Why not? Narrow and long tables are perfect for relational databases       >       >              Thanks, I did not know that, as I have limited exposure in db field.              >       > > I we       >       > > were to do this, what would be a good candidate for primary key here?       >       >       >       > What combination of columns identifies a row? Vehicle_ID and GPS_TIME? If       >       > so, there's your clustered primary key (wtp?)       >       > What is that event id and why is it nullable?              Some events do not require id, this will not be any foreign key or anything              >       >       >       >       >       > > Just for illustration I show table def for idea# 1       >       > > create table idea1 (       >       > > VEHICLE_ID int not null       >       > > , GPS_X real not null       >       > > , GPS_Y real not null       >       > > , GPS_TIME datetime not null       >       > > , EVENT_LETTER char not null       >       > > , EVENT_ID int null       >       > > );       >       > > I don't show indexes as I am not clear what to use.       >       > > The typical query would look like this:       >       > > select       >       > > GPS_X       >       > > , GPS_Y       >       > > , GPS_TIME       >       > > , EVENT_LETTER       >       > > , EVENT_ID       >       > > from       >       > > idea1       >       > > where       >       > > VEHICLE_ID=@id       >       > > and GPS_TIME between @from and @to       >       >       >       > The proposed clustered primary key above would suit this query perfectly       >       >       >       > >       >       > >       >       > > Second idea is to keep a small table where records are inserted as       >       > > they come in the structure shown above and then on the daily basis       >       > > aggregate it and store as one record per vehicle in the another       >       > > table, which will be used for queries. By doing this we shrink number       >       > > of records from 120 million to be ~ 350 thousands per year. This       >       > > looks much more manageable. The cons here are additional effort to       >       > > come up with the way of storing and retrieving gps readings as a       >       > > blob.       >       > >       >       > Nah, this data isn't really suitable for aggregation, IMO.              Thanks for your input.              --- SoupGate-Win32 v1.05        * Origin: you cannot sedate... all the things you hate (1:229/2)    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
(c) 1994, bbs@darkrealms.ca