Tech Trained Monkey

Everyday Problem Solvers

Tag Archives: Database

Why integers are lousy primary keys

When I was a noob (about 2 years ago) I took a freelance gig to develop a simple system for a small shop. Very basic stuff. Few weeks ago the owner called me saying that he was very happy but he was facing a problem… You see, his business grew and he now has 3 shops, all of them using my system, and he wants to consolidate on one database. Well, as you can imagine I only managed to accomplish this by replacing the primary keys. They were all integers, now they’re GUIDs. The reason? Integers clash very easily!  Read more of this post

Advertisements

When it’s time to change/move/replan my database?

Quite a while ago now, a “Database and DBMS” teacher assigned a project that we were supposed to research something that he would not talk about in class… some people gather in groups… I did my alone… I don’t like doing this kind of thing with other people because I hate when the idiot in the group says something like “Oh your wrong! I misunderstood the teacher, didn’t come to class, never worked with that and of course I didn’t studied jack-shit about what you just said but I know your wrong” (more about that in other posts)… I rather do it alone, and sell the “extra-spots” to people who didn’t had the time to do the paper… oh my god… back to the point… I did my paper about indexing and also prepared a killer presentation.

When I say that some people gasp on the fact that we have 2 semesters (1 year) of classes focused on databases and indexing was not even mentioned… luckily for me I don’t rely in the college to teach me anything… I learn and go there to get a degree (again more on that later)…

In the paper I developed an idea of a “threshold-line”, which would be an imaginary point that after which the time taken to a certain query to return would be so great that the results would not resolve the issue/demand. To illustrate lets imagine that you have to go to a party tonight and you have to pick your date before. You have 2 cars: one is top less, and the other not. What good is the weather report tomorrow morning to help you decide which car to use? Now try asking google what good is a search result that takes 0.7 sec…

I also worked on the idea that when you re-index your database your “threshold-points” increase, so you can’t reindex when your on the limit because you will burst your quota! the idea is to plan and determine the exact point where you can re-index and you would still be inside your “confort-zone”. One line of thought that i did not exposed on my presentation is that eventually re-indexing wont solve your performance problem.

A day will come when your single database server wont be able, even with a fully indexed file, to handle all the queries. At this point a natural answer comes to mind: Add more servers. But my point is: With 1 server you can handle X requests. With 2 servers you can’t handle 2X requests. Even very well archtected/engeniered/implemented programs cant scale like that. The main point is some applications are not so well built (archtected/engeniered/implemented), so when necessesity comes who has your back?

Lets face it: Planning might be fun to some, but its painfull for most of us. We are simply unable to predict the futere, and most of ours managers (at least mine) think that gathering data today to generate a projection for something else than sales is just time wasted. So, when we get bottle-necked, were not even close to getting that new server… We have to prove that we have a problem today, that will be solved by buying/renting/leasing a new server. The thing that most managers dont realize is that expansion plan is like a disater-recovery plan: “You will only use it if you need it. You might give it a try in a staging enviroment. But not really use it unless necessary.”

A important part of the plan that most people forget to study is “When I should start working?”. If your plan says to call the maintance guy once the pipes are broken, instead than when you realize that pressure is above normal, I must say that your plan could be better. In project management we call this risk mitigation: Identifing a possible problem and working towards that item to avoid a possible problem. If you start think that its might be time to add more servers when your servers are 90% and performance is long gone, I can say that it could be worst, I’ve witnessed it… Recently I client asked me when he should add a third server to his farm and I said “When your average load is about 45%”. He really said something like “I’m not paying your company for clowns”… Then I asked what he was expecting. He was expecting about 80%… then I replied “So, when 1 of your servers crash the other will crash two, because he will have to handle 160% load. If you operate 2 servers at 45% if one crash the other can absorb the hit and still has some margin for a spike.”

Planning changes (and sticking with the plan) is a act of maturity. Not everyone can do it. I have some problems with that. It is so important that i don’t recommend any system to go into production without a “grown” expectation plan at least on advanced stage of development. If you decide that its time when users start to complain, its already too late! Your product reputation is already stained!