Do complex problems demand complex solutions?
Do complex problems demand complex solutions? Personally, I don't think so. I think people are blinded by complex problems, in the sense that they can only see complex solutions. There are also many cases where complex problems are solved with complex solutions and simple problems are solved with complex solutions. Let's look at some examples of simple solutions for complex problems.
Map-reduceDo you need to solve something in parallel? Then using Map-Reduce is a pretty good idea. And playing around with map-reduce isn't that hard, here is a barebone implementation in Erlang:
pmap(F, L) ->
S = self(),
Pids = map(fun(I) ->
spawn(fun() -> do_f(S, F, I) end)
end, L),
gather(Pids).
gather([H|T]) ->
receive
{H, Ret} -> [Ret|gather(T)]
end;
gather([]) ->
[].
do_f(Parent, F, I) ->
Parent ! {self(), (catch F(I))}.
Spell checkingGoogle's director of reasearch, Peter Norvig, has created an amazing spell checker in only 21 lines of Python code. Given enough data, his spell checker beats spell checkers like Aspell - - since he solution uses statistics, where Aspell and the like use dictionary lookups. They have used the same basic idea for Google Translate, which is one of the best translator today. His spell checker implementation: import re, collections
def words(text): return re.findall('[a-z]+', text.lower())
def train(features):
model = collections.defaultdict(lambda: 1)
for f in features:
model[f] += 1
return model
NWORDS = train(words(file('big.txt').read()))
alphabet = 'abcdefghijklmnopqrstuvwxyz'
def edits1(word):
n = len(word)
return set([word[0:i]+word[i+1:] for i in range(n)] +
[word[0:i]+word[i+1]+word[i]+word[i+2:] for i in range(n-1)] +
[word[0:i]+c+word[i+1:] for i in range(n) for c in alphabet] +
[word[0:i]+c+word[i:] for i in range(n+1) for c in alphabet])
def known_edits2(word):
return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)
def known(words): return set(w for w in words if w in NWORDS)
def correct(word):
candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
return max(candidates, key=lambda w: NWORDS[w])
QuickSortQuickSort is an sorting algorithm, it's worst case running time is O(n^2) [that's really bad]. But the average running time is O(n*log(n)) and in most cases it's a lot faster than other O(n*log(n)) algorithms, hence it's used as the standard sorting algorithm in most sorting libraries. QuickSort implemented in Haskell: qsort [] = [] qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs) It's a simple solution with a dumb worst case running time that in most cases beats all the other (and more complex) algorithms. MemcachedMemcached has changed the way we view and use caching. It's simple to use, it's simple to scale and it's a simple solution for a "complex" problem. Some time ago I reviewed memcached's implementation, check it out for more info. Distributed hash table (DHT)I am in the process of developing a distributed hash table. I have looked at a _lot_ of solutions for this problem, the largest of them are following:
The problem thought is that most of the solutions are _really_ complex and I don't think one needs a complex solution to solve the DHT problem. Here is a get and set method of my DHT: def get(key):
"""Lookup's the storage node in the
lookup ring and return's the stored value"""
storage_node = locate_node(key)
if storage_node:
return storage_node.get(key)
return None
def delete(key):
"""Lookup's the storage node in the lookup ring
and deletes the key from lookup ring and storage node"""
storage_node = locate_node(key)
for i, lookup_node in enumerate(lookup_ring.iterate_nodes(key)):
if i > 2:
break
lookup_node.delete(key)
if storage_node:
storage_node.delete(key)
return True
I look at my solution and think: damn, this is really simple, why didn't the others think of it? |
|