Sunday, November 2, 2014

nine algorithms that changed the future 筆記

1. search engine indexing

  • indexing, matching and ranking
  • word-location trick
  • NEAR
  • the meta word trick
  • in TITLE
2. PageRank

  • the hyperlink trick
  • the authority trick
  • the random surfer trick
  • web spam, link-based ranking algorithm, use mathematical with less computational expense
3. public key cryptography

  •  Encrypted with a shared secret
  • Establish shared secret in public
  • Paint-mixing trick; public/private color, public-private mixture, shared secret color
  • In real life, mixing: discrete exponential , unmix: discrete logarithm , diffuse-Hellman, (base, clock size)
  • Prime number clock size
  • The base must be primitive root of clock size, power of bade eventuallyh cycle through every possible value on the clock


4. Error correction 
Correct
Repetition trick
Redundancy trick; 7-4 hamming code(first error correcting code) correct 1 error in each 7 digits
Detect
Simple Checksum trick, staircase checksum trick

Detect and correct
Pinpoint trick; two-dimensional parity 

In practice 
Reed-Solomon code
Ethernet CRC32
Md5, 40 digits
sha1( all cryptographic function, protect against malicious alteration) 50digits
Sha256, 75digits
sha512, 150digits

Low density parity check(satellite tv)


5. Pattern Recognition:
[...]

6. Data compression:
lossless: run length encoding, same-as-earlier trick, shorter-symbol trick, huffman
lossy: leave-it-out trick, jpeg(divide by chuck)
LZ77, Shannon-Fano coding, Huffman is Fano's student at MIT


7. database
write-ahead logging (to-do list trick), each entry in the log is "idempotent", can be executed several times.

Replica, replicated databases keeps all copies of the database in sync all the time. Different than "backup".

rollback, we might need it when deadlock.
two-phase commit( prepare-then-commit trick)
virtual table (join)

All is to make sure consistency.

8. Digital signature
lockbox & padlock, a verifiable signature
=> in digital:
keys: numbers
act of locking/unlocking: multiplication in clock arithmetic (in fact, exponentiation)

(multiplicative padlock trick)
large message can applied a transformation called "cryptographic hash function" to reduce the size/length of the message.

locked value => Euclid's algorithm => key value:
Because integer factorization is hard. (quantum computing, probability)
In practice, verification of downloaded content. CA(trusted back): certificated authority

chain of digital certificates (inspected by your web browser)

9. What's computable?
Alan turing.
Halting problem: can't write a program output yes if the program will be terminated.
Undecidability

maybe you'll be musing to yourself one day, "Now wouldn't it be nice if my computer could do this for me" only to realize that it's an impossibility, because you wished-for task can be prove undecidable using the same method as for our crash-finding program

-End-

"knowledge of a programming language is essential for computer scientists, but it is merely a prerequisite: the main challenge is toe invent, adapt, and understand algorithms."