Caesar Cipher, Advanced: How to Tell Whether a Ciphertext Was Randomly Generated

(2019-08-28 21:47:25)
Category: Programming and Math

"I chose Python as a working title for the project, being in a slightly irreverent mood (and a big fan of Monty Python's Flying Circus)." - Guido van Rossum




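Task-driven, scenario-based learning


Example lessons that cover the key knowledge and skills

A 30-lesson course for the kids at Dingdingmao (丁丁猫) Maker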

"Don't set out to learn Python. Choose a problem you're interested in and learn to solve it with Python."



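Keywords


generated randomly

import matplotlib.pyplot as plt

visualization: letting the data present the conclusion

wrap-around: when a shift runs past the end of the alphabet, it continues from the start (the original post illustrates this with a figure)

 

caesar: one of the oldest known encryption methods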




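Common scenarios

 


Make sure you master the essentials. This lesson covers:

How to randomly generate characters and paragraphs of text

 

How to recognize the characteristics of randomly generated text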



import matplotlib.pyplot as plt
alphabet = "abcdefghijklmnopqrstuvwxyz"

code = """
swodkdbkfovvobpbywkxkxdsaeovkxngrycksndgyfkcdkxndbexuvoccvoqcypcdyx
ocdkxnsxdronocobdxokbdrowyxdrockxnrkvpcexukcrkddobonfsckqovsocgrycop
bygxkxngbsxuvonvszkxncxoobypmyvnmywwkxndovvdrkdsdccmevzdybgovvdrycoz
kccsyxcbokngrsmriodcebfsfocdkwzonyxdrocovspovoccdrsxqcdrorkxndrkdwym
uondrowkxndrorokbddrkdponkxnyxdrozonocdkvdrocogybnckzzokbwixkwoscyji
wkxnskcusxqypusxqcvyyuyxwigybuciowsqrdikxnnoczksbxydrsxqlocsnobowksx
cbyexndronomkiypdrkdmyvycckvgbomulyexnvocckxnlkbodrovyxokxnvofovckxn
ccdbodmrpkbkgki
"""

letter_counts = [code.count(l) for l in alphabet]
letter_colors = plt.cm.hsv([0.8*i/max(letter_counts) for i in letter_counts])

plt.bar(range(26), letter_counts, color=letter_colors)
plt.xticks(range(26), alphabet) # letter labels on x-axis
plt.tick_params(axis="x", bottom=False) # no ticks, only labels on x-axis
plt.title("Frequency of each letter")
plt.savefig("output.png")


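In the previous lesson we learned that the Caesar cipher converts each letter of the plaintext to its corresponding number (for example with Python's ord() function) and then adds or subtracts a fixed value from every number. To decrypt, simply apply the inverse operation.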
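As a reminder of that mechanism, here is a minimal sketch of a Caesar shift with wrap-around. The function name caesar_shift and the sample string are made up for illustration; the previous lesson's own code may look different.

```python
import string

ALPHABET = string.ascii_lowercase

def caesar_shift(text, shift):
    """Shift each a-z letter by `shift` positions; other characters pass through."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            # ord() gives the character code; subtracting ord("a") maps a-z to 0-25.
            index = (ord(ch) - ord("a") + shift) % 26  # % 26 provides the wrap-around
            result.append(chr(index + ord("a")))
        else:
            result.append(ch)
    return "".join(result)

secret = caesar_shift("xyz abc", 3)
print(secret)                    # abc def
print(caesar_shift(secret, -3))  # xyz abc
```

Decryption is just the same function called with the negated shift, which is the "inverse operation" mentioned above.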

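Today's challenge: the enemy is deliberately trying to confuse us. Each of the Dingdingmao kids receives two texts of jumbled letters. Both look random, but they are not the same: one of them is a meaningful message encrypted with a Caesar cipher.


The first step is to work out which of the two was Caesar-encrypted; only then can the original message be recovered. The passages quoted from the original English exercise are optional; the surrounding explanation is enough to follow this post.


Recall practice task 2 from the previous lesson: which letter occurs most often in the ciphertext?

Now we put the result of that task to use.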


message = """
Once upon a midnight dreary, while I pondered, weak and weary,
Over many a quaint and curious volume of forgotten lore—
While I nodded, nearly napping, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door—"""


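Now let's find out how often each letter occurs in message.

To make any pattern easier to spot, the frequencies will be shown graphically.

First we need some code to do the counting: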


text = code

alphabet = "abcdefghijklmnopqrstuvwxyz"

def count_most(text):
    # Walk through the 26 letters a-z and keep the one that occurs most often.
    # e_most is the most frequent letter; bench is how many times it occurs.
    e_most, bench = None, 0
    for e in alphabet:
        n = text.count(e)
        if n > bench:
            e_most, bench = e, n
    return e_most, bench

print(count_most(text))

('g', 27, 'z', ['a', 'a', 'a', ..., 'y', 'y', 'y', 'y'])  # sorted letter list abbreviated

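The result: the letter 'g' occurs most often, 27 times! (This printout evidently comes from an earlier version of the function, which returned extra values, run on a different ciphertext than the code string above.)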
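The same count can also be obtained with collections.Counter from the standard library. This is only an alternative sketch, not the post's own method; the helper name most_common_letter and the sample string are made up for illustration.

```python
from collections import Counter
import string

def most_common_letter(text):
    """Return (letter, count) for the most frequent a-z letter, or (None, 0)."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    if not counts:
        return None, 0
    return counts.most_common(1)[0]

sample = "hello world"  # hypothetical sample text
print(most_common_letter(sample))  # ('l', 3)
```

Counter scans the text once, whereas calling text.count() for each of the 26 letters scans it 26 times; for texts of this size either approach is fine.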


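Okay! Now we need a little imagination:

We have two texts: one is a string of randomly generated letters, the other a meaningful piece of writing. Is there some feature that tells the two apart?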


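Here is the key point: random means that every letter occurs with roughly the same frequency. The example above supports our guess, since 'g' occurs far more often than the average letter. To test the hypothesis, we plot the letter frequency distribution of each of the two texts.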
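To see what "roughly the same frequency" looks like, one can generate random letters and count them. The following is a minimal sketch using only the standard library; the function name random_text, the length 5000 and the seed are arbitrary choices for illustration.

```python
import random
import string
from collections import Counter

def random_text(length, seed=None):
    """Return `length` lowercase letters, each drawn uniformly from a-z."""
    rng = random.Random(seed)
    return "".join(rng.choices(string.ascii_lowercase, k=length))

noise = random_text(5000, seed=1)
counts = Counter(noise)
expected = len(noise) / 26  # count each letter would have if perfectly even

# Print each letter, its count, and the ratio to the even share.
for letter in string.ascii_lowercase:
    print(letter, counts[letter], round(counts[letter] / expected, 2))
```

For random text every ratio stays close to 1. In a natural-language text, even after a Caesar shift, a few letters reach several times the even share while others barely appear.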

 

https://mmbiz.qpic.cn/mmbiz_png/96iaDuCe2JMia11zHhuEUNu08ELpW84ib0ia5OjnHbMQCfnxoKSQ33lU7xhk7LEabUwCIm1zKnMXhGKM2WibfJus5Mg/640?wx_fmt=png


Text 1 =
"yfdpcpoplhhwdpssbjnsqvtlcpzpxqugtjphvgotuvwxufgoqigxwgkskduooyeuoue
fjlnmsqpgxrmcseeliswdheywseqgcbeothskxdzekgxmmkildjnaqbukprpfaaknsu
qpdwayqaqfxsoapvsgreqydqjnkpjghvrkygtidzibhrqkmocukhcunpjcazzvomtsc
fgycwfltmiegaejwcqrgsnxxcbtcrckufwsdxdhbxgppxcuzapbdhftzmugryfseavv
bssqlxanvmfwwzityziixasivzkmvtfczqmdgkabcnjbyhaoealengfptuedlmvryeb
titbwqkekzdpmbtiphdkwwiduassvbgalxgrfhrjrjplxpujrprqzcpcdqsjorigazt
kwwlnwbjryrzhgcttroyemuwwixwufymnknirzmexyowobvardlqktzajzoijwulomg
ztefdpftjealzapcgipgaaspuzxklvd"



Text 2 =
"swodkdbkfovvobpbywkxkxdsaeovkxngrycksndgyfkcdkxndbexuvoccvoqcypcdyx
ocdkxnsxdronocobdxokbdrowyxdrockxnrkvpcexukcrkddobonfsckqovsocgrycop
bygxkxngbsxuvonvszkxncxoobypmyvnmywwkxndovvdrkdsdccmevzdybgovvdrycoz
kccsyxcbokngrsmriodcebfsfocdkwzonyxdrocovspovoccdrsxqcdrorkxndrkdwym
uondrowkxndrorokbddrkdponkxnyxdrozonocdkvdrocogybnckzzokbwixkwoscyji
wkxnskcusxqypusxqcvyyuyxwigybuciowsqrdikxnnoczksbxydrsxqlocsnobowksx
cbyexndronomkiypdrkdmyvycckvgbomulyexnvocckxnlkbodrovyxokxnvofovckxn
ccdbodmrpkbkgki"


Above are the two messages; we do not yet know which one carries real information.


Below you see two strings of letters. Both seem random, but one of them is a meaningful text encoded with a Caesar cipher. One way of telling coded messages apart from random noise is to look at the letter frequencies: if a few letters appear significantly more often than the rest, as is usually the case in written language, then the text is most likely not randomly generated.

To help you decide which text is which, here is a program that can show how often each letter appears as a bar graph. Copy each text into the indicated line and run the program to see it.

Which text contains a secret message?

 

 

The two texts turn out to have clearly different characteristics:



In Text 1, the letters occur with fairly even frequencies.

In Text 2, a few letters, notably o, k and d, stand well above the average.


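The Dingdingmao kids should therefore pick the second text to decrypt. For those interested, the explanation from the original English exercise follows.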
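Once Text 2 is chosen, the frequency result also suggests how to crack it. The sketch below assumes that the most frequent ciphertext letter stands for 'e', the most common letter in typical English; that assumption, and the helper names guess_shift and decrypt, are illustrative and not part of the original exercise.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase

# Text 2 from the post, joined into one string.
TEXT2 = (
    "swodkdbkfovvobpbywkxkxdsaeovkxngrycksndgyfkcdkxndbexuvoccvoqcypcdyx"
    "ocdkxnsxdronocobdxokbdrowyxdrockxnrkvpcexukcrkddobonfsckqovsocgrycop"
    "bygxkxngbsxuvonvszkxncxoobypmyvnmywwkxndovvdrkdsdccmevzdybgovvdrycoz"
    "kccsyxcbokngrsmriodcebfsfocdkwzonyxdrocovspovoccdrsxqcdrorkxndrkdwym"
    "uondrowkxndrorokbddrkdponkxnyxdrozonocdkvdrocogybnckzzokbwixkwoscyji"
    "wkxnskcusxqypusxqcvyyuyxwigybuciowsqrdikxnnoczksbxydrsxqlocsnobowksx"
    "cbyexndronomkiypdrkdmyvycckvgbomulyexnvocckxnlkbodrovyxokxnvofovckxn"
    "ccdbodmrpkbkgki"
)

def guess_shift(ciphertext, assumed_top="e"):
    """Guess the shift by assuming the most frequent letter encodes `assumed_top`."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    if not counts:
        raise ValueError("ciphertext contains no a-z letters")
    top_letter = counts.most_common(1)[0][0]
    return (ord(top_letter) - ord(assumed_top)) % 26

def decrypt(ciphertext, shift):
    """Undo a Caesar shift of `shift` positions on a-z letters."""
    return "".join(
        chr((ord(ch) - ord("a") - shift) % 26 + ord("a")) if ch in ALPHABET else ch
        for ch in ciphertext
    )

shift = guess_shift(TEXT2)
print(shift)
print(decrypt(TEXT2, shift)[:40])
```

For Text 2 the guessed shift comes out as 10, and the decrypted text begins with "imetatraveller". The guess can fail on short texts, where 'e' is not necessarily the most frequent letter; in that case trying all 26 shifts is a simple fallback.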

Correct answer: Text 2

Here are the letter distributions of both texts, side by side: The letters in the first text occur much more uniformly than in the second, where a few letters appear very often and a good portion of the alphabet almost not at all. This sort of uneven letter distribution is characteristic of a natural language text. The uniform distribution of the first text is a strong sign that it has been generated randomly.
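The "uniform versus uneven" judgment can also be expressed as a single number. One common choice is the index of coincidence: the probability that two letters drawn at random from the text are the same. The original exercise does not use this measure; it is offered here only as an illustrative sketch, and the function name is an assumption.

```python
import string
from collections import Counter

def index_of_coincidence(text):
    """Probability that two letters picked at random from `text` are identical."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    # Number of matching ordered pairs divided by the number of all ordered pairs.
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

print(index_of_coincidence("aaaa"))                  # 1.0
print(index_of_coincidence(string.ascii_lowercase))  # 0.0
```

Uniformly random letters give a value near 1/26, about 0.038, while English text is commonly cited at around 0.067. A Caesar shift only renames the letters, so it leaves this value unchanged, which makes the measure usable directly on the ciphertext.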

Previous lesson:


Python Beginner Practice: The Caesar Cipher


 

 Words from the pros 


"Don't set out to learn Python. Choose a problem you're interested in and learn to solve it with Python." - @jakevdp




"It’s not at all important to get it right the first time. It’s vitally important to get it right the last time.” - The Pragmatic Programmer



"Every great developer you know got there by solving problems they were unqualified to solve until they actually did it." - Patrick McKenzie


 




Quick-reference keywords


 1. Collections: 

List, Dictionary, Set, Tuple, Range, Enumerate, Iterator, Generator.

 2. Types:  

Type, String, Regular_Exp, Format, Numbers, Combinatorics, Datetime

 3. Syntax:   

Args, Inline, Closure, Decorator, Class, Duck_Types, Enum, Exceptions

 4. System:  

Print, Input, Command_Line_Arguments, Open, Path, Command_Execution.

 5. Data:  

CSV, JSON, Pickle, SQLite, Bytes, Struct, Array, MemoryView, Deque.

 6. Advanced:   

Threading, Operator, Introspection, Metaprogramming, Eval, Coroutine.

 7. Libraries:  

Progress_Bar, Plot, Table, Curses, Logging, Scraping, Web, Profile, NumPy, Image, Audio.

