Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.databases.ms-sqlserver    |    Notorious Rube Goldberg contraption    |    19,505 messages    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
|    Message 18,747 of 19,505    |
|    Bob Barrows to migurus    |
|    Re: table design for massive GPS data    |
|    14 Sep 12 07:33:54    |
      From: reb01501@NOSPAMyahoo.com              migurus wrote:       > Windows Server/SQL Server 2008, we need to store GPS readings from       > roughly 1000 vehicles, ~ 300 readings per vehicle per day. The data       > comes in in real-time. Records are never updated after insert. That       > gives us roughly 120 million records per year. We need to keep last 5       > years of records.       >       > The usage of stored data is to query vehicle positions and event       > attributes within entire day by given vehicle and date. Geospatial       > features and functions are not really needed for the reporting.       >       > I'd like to discuss table design for these requirements.       >       > Here is a preliminary record content:       > - vehicle id       > - gps reading X and Y       > - gps reading time       > - one-letter event attribute       > - event id (nullable)       >       > First idea is to keep data normalized in one table, one record per       > GPS reading. It looks very simple to implement, but I doubt the       > server will be happy with this high number of small records.              Why not? Narrow and long tables are perfect for relational databases              > I we       > were to do this, what would be a good candidate for primary key here?              What combination of columns identifies a row? Vehicle_ID and GPS_TIME? If       so, there's your clustered primary key (wtp?)       What is that event id and why is it nullable?                     > Just for illustration I show table def for idea# 1       > create table idea1 (       > VEHICLE_ID int not null       > , GPS_X real not null       > , GPS_Y real not null       > , GPS_TIME datetime not null       > , EVENT_LETTER char not null       > , EVENT_ID int null       > );       > I don't show indexes as I am not clear what to use.       > The typical query would look like this:       > select       > GPS_X       > , GPS_Y       > , GPS_TIME       > , EVENT_LETTER       > , EVENT_ID       > from       > idea1       > where       > VEHICLE_ID=@id       > and GPS_TIME between @from and @to              The proposed clustered primary key above would suit this query perfectly              >       >       > Second idea is to keep a small table where records are inserted as       > they come in the structure shown above and then on the daily basis       > aggregate it and store as one record per vehicle in the another       > table, which will be used for queries. By doing this we shrink number       > of records from 120 million to be ~ 350 thousands per year. This       > looks much more manageable. The cons here are additional effort to       > come up with the way of storing and retrieving gps readings as a       > blob.       >       Nah, this data isn't really suitable for aggregation, IMO.              --- SoupGate-Win32 v1.05        * Origin: you cannot sedate... all the things you hate (1:229/2)    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
(c) 1994, bbs@darkrealms.ca