Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.databases.ms-sqlserver    |    Notorious Rube Goldberg contraption    |    19,505 messages    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
|    Message 18,750 of 19,505    |
|    migurus to rja.ca...@gmail.com    |
|    Re: table design for massive GPS data    |
|    14 Sep 12 10:25:00    |
      From: migurus@yahoo.com              On Friday, September 14, 2012 5:05:05 AM UTC-7, rja.ca...@gmail.com wrote:       > Off the top of a semi-qualified head:       >       >       >       > If the table is indexed well, then query performance will depend on the       >       > volume of data in a query, but not so much on the size of a table and the       >       > rows that you aren't looking at.       >       >              Thanks, I had an impression that hundreds of millions of small records is not       a sound design.              >       > If the typical query is all the records for one vehicle on one calendar day       >       > (one working day), then a primary key of date, vehicle id, time, looks       >       > possible. If you want to count work after midnight for the previous day,       >       > that could be messier - you could think about using datetimeoffset.       >       >       >       > I expect such a design also to be efficient when you are deleting       >       > records that are 5 years old every day.       >       >       >       > If you use datetime order, then vehicle, then the server has to scan       >       > records of 1000 vehicles to get the data that you actually want to see.       >       > On the other hand, if you want the query for 1000 vehicles, then       >       > it's probably okay if they're mixed together.       >              One vehicle at a time queries required.              >       >       > If data arrives in real time, then a clustered index of date, vehicle, time,       >       > may slow down updates, while favouring the query. Usually you want it       >       > that way around. The clustered index makes SQL Server physically store       >       > neighbouring records together. So you'll have 1000 distinct points in       >       > the data file where each vehicle's records are being added. You'd also       >       > see data filling pages that are then split into two, a lot. But I expect       >       > the server still to perform well in that case. There also may be an       >       > implication of the growth of the transaction log, but if data is inserted       >       > one row at a time then it won't make much of a difference - the log ray be       >       > pretty big whatever you do.       >              There is no updates, as I mentioned. Inserts only.       I love your idea of building three part index: date, vehicle_id, time.       This is right to the point and not a big overhead space wise.              >       >       > Your primary key is of course unique and doesn't allow NULL and by       >       > default is a clustered index, but none of this is necessary,       >       > including even having a primary key as such. Indexes that aren't       >       > the primary key will work just as well. The clustered index does       >       > matter; it's also desirable for its key to be short, since it's       >       > incorporated into any nonclustered indexes, so if there isn't a need       >       > to use the clustered index to reorganise the data then a clustered       >       > index on an IDENTITY(int, 1, 1) column can be pretty good.       >       > /Then/ you set your nonclustered primary key (which does have       >       > to be unique and not NULL). Including a non-meaningful IDENTITY       >       > in a primary key wouldn't be right, but it's right if it is,       >       > in fact, the key of your table in relationships with other data.       >       >       >       > The point is, if your data is not unique in the date, time, and       >       > vehicle ID fields, then don't get sidetracked into worrying about       >       > a primary key that you can't have. Alternatively, add 1 second       >       > to one of the duplicate key values, and it'll be unique.       >       >       >       > Primary key also is a constraint that must have a unique object name       >       > (PK_nameoftable probably), whereas you can and I do name other indexes       >       > IX_1, IX_2, IX_3, on every table in your database. This doesn't       >       > help other developers understand the database, but other developers       >       > don't often help me, so if they want to know what my indexes are,       >       > they can look 'em up. I probably have 500 nearly identical tables       >       > created at the same time. (Arguably I should have one table with       >       > 500 times as many rows and a couple of extra columns, but that isn't       >       > how we do things here.) So I don't want to invent 500 meaningful       >       > index names.       >       >       >       > By the way, this project sounds important enough that you'll be       >       > thinking carefully about data backup and recovering from disaster       >       > or downtime, if you don't already have that covered thoroughly       >       > for your platform. So, your thinking about backup and recovery       >       > also may bear on your database design. For instance, do you       >       > want that big, fat transaction log that holds everything that       >       > happened in your database between the last backup and the failure -       >       > or do you have files of the last 7 days' GPS data that you can       >       > re-load as part of your disaster recovery process, so that it       >       > doesn't matter if the database is blown away except for the       >       > twice-weekly taken-off-site backup?              I was thinking about an hourly log shipping and daily or weekly full backups.       Would really appreciate a comment on this idea.              Thanks for your thoughts.              --- SoupGate-Win32 v1.05        * Origin: you cannot sedate... all the things you hate (1:229/2)    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
(c) 1994, bbs@darkrealms.ca