Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
I Want a New Data Store (for Craigslist) (zawodny.com)
73 points by blasdel on April 29, 2010 | hide | past | favorite | 25 comments


GenieDB http://www.geniedb.com is an interesting possibility for these kinds of migrations. It allows you to use NoSQL type stores with a MySQL interface, using SQL and even doing JOINs between MySQL and NoSQL tables. The idea is you can gradually adjust your application over time without a sudden transition where you have to rewrite all your database code.

(Disclosure: I'm on the advisory board.)


Hmm... Very interesting idea, thanks!


We've got a few TB in couchdb-lounge (we also wrote couchdb-lounge) so we think it works. Big pluses ... it's got a RESTful API which means that we don't need a special library to access it in any language ... if you can speak http you can talk to couch. This is especially important for us because there are lots of places where we like to talk to our databases asynchronously. It's a lot easier to find an async HTTP API than it is to find a random db API.


That's a great point about async libs. Look no farther than MySQL to see how much pain it can be to get a good one in your language of choice.


> to influence the organization of records on disk so that the most common queries will require very few seeks

might be revealing my lack of knowledge here but what tools/apis etc does one use when they want to work at this level? It seems like a good thing to do but I'd have no idea how to go about it. Enlightenment appreciated.


SQL has a CLUSTER command, for instance. See e.g. http://www.postgresql.org/docs/8.4/interactive/sql-cluster.h....


ahhh, thanks.


Reminds me of this article on how friendfeed implemented schemaless storage on top of MySQL:

http://bret.appspot.com/entry/how-friendfeed-uses-mysql

I always thought it was a very clever design.


Yup. It's a very interesting mix of using a subset of something very well known (MySQL) to get what you need done.


I know it's not NoSQL but wouldn't this be a very good candidate for RethinkDB? A few TBs of SSD is not that expensive for a major site. It also supports live schema changes and is a drop-in replacement for MySQL.


As luck would have it, I'm an advisor for RethinkDB. They're doing great work, but also aren't quite ready for prime time either. I'm very optimistic for their technology in an OLTP system, but this is not like an OLTP system at all.


Redis says that craigslist uses Redis. You have a link to a redis tutorial on your blog. Is this not your answer?


Redis is great for some things but is definitely not the right tool here. It's built mainly for serving up small chunks of info (or functionality) very quickly. And it kicks ass at that.

This is a system that probably sees even more writes than reads, and they're batched up mostly at night anyway. The longer term management of that infrastructure is a big part of the pain.


whoa

ALTER TABLE takes an unreasonably long time [...] Changes take over a month.

How about a RAID of SSD ? Or a Fusion-IO ioDrive?

Shouldn't they be an order of magnitude faster?

I'd love to see a benchmark comparison of a large db with SSD vs conventional (can't find one).

added: here's an old benchmark with an older slower SSD that is faster than RAID10 http://www.bigdbahead.com/?p=68 and an update with more details (and faster) here http://www.bigdbahead.com/?p=557


Yes and no. Fusion-io removes the io bottleneck but ALTER TABLE is still single threaded. So you're looking at a CPU bottleneck where only 1 core can be used.

We just did a lot of ALTER TABLE work on one of our clusters WITH fusion-io drives (2 in a RAID-0) for a 360GB table and it still took ~8 hours to finish.


I just had another thought in case you check back on this thread someday.

Do you know if FusionIO drives internally run as Raid5? Not your external Raid0, but since flash chips are so relatively small in capacity right now, they must using an array internally, maybe Raid5?

If so, Raid5 has a severe performance penalty when very small amount of data are being written (smaller than a single stripe) because it has to then go update the parity across the volume on every write and that defeats any caching.

We already know ALTER itself is slow but I wonder if your storage is working against you in this case because ALTER does a single write for every row in the column when rebuilding the table...?


So what was the CPU on that bottleneck, out of curiosity?

If single core is the problem, throw a massive clock rate at it?

The i5-680 is out this month and is 3.6ghz native ("turbo" near 4ghz) should do 5ghz with good air cooling (it's 32nm) and then you could use ECC memory to be certain in a mission-critical environment?


This was... lemme see. less /proc/cpuinfo

Intel(R) Xeon(R) CPU X5570 @ 2.93GHz

So it was a pretty high-end Nehalem proc already. And, yeah, we do have ECC RAM in the box, of course. Plus InnoDB does page-level checksumming to guard against corruption too.

But still, it was ~8 hours or so. Using all 8 cores on the box would have brought it down to ~1 hour if it could have been perfectly parallelized (and much of it could have been, I bet).


Ah okay, so even with the fastest clocks available today, the best you could hope for is cutting the 8 hours to 4, or one month to two weeks.

As far as the parallel cores it sounds like you already have thought about possible "end runs" around MySQL's "transaction safe" ALTER (which is probably why it's single threaded). InnoDB supposedly has a few performance improvements for ALTER over MyISAM so you are already taking advantage of that.


Memory ECC won't help if computations are incorrect.


Good computations won't help if your memory is corrupt.


That's not what soundsop was saying - he was just pointing out the obivous error in the suggestion to overclock the CPU, and then use ECC to be safe. The ECC, though a good idea by itself, will not help with any errors introduced by the overclock.


I think Riak has everything you need. It's worth a look anyway.


Yeah, a few folks have mentioned it already. And I remember seeing some good features in the video I watched about it several weeks ago...


So it is Cassandra then?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: