Sunday, April 26, 2015

The Difference Between Python List and Tuple

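  Interviewers keep asking about this, along with the use cases. I mostly use list and rarely tuple,
and I have not read the source code, but the basics can be covered from the following angles:
  1. The contents of a list can be changed (add, remove, modify); a tuple's cannot: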
>>> alist = [1,2,3,4]
>>> atuple = (1,2,3,4)
>>> atuple.append(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'
>>> dir(atuple)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']
>>> dir(alist)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
As you can see, tuple has no mutating methods at all.
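The same point from another angle, as a small sketch: assigning to a slot of a tuple raises `TypeError`, while the same assignment on a list works. The helper name `try_set_first` is made up for illustration. Note that only the tuple's own slots are fixed; a mutable object stored inside a tuple can still change.

```python
def try_set_first(seq, value):
    """Try to overwrite seq[0]; return True on success, False if seq rejects it."""
    try:
        seq[0] = value
        return True
    except TypeError:
        return False


alist = [1, 2, 3, 4]
atuple = (1, 2, 3, 4)

list_changed = try_set_first(alist, 99)    # lists support item assignment
tuple_changed = try_set_first(atuple, 99)  # tuples do not

# The tuple's slots are fixed, but a mutable element inside it can still be modified.
nested = (1, [2, 3])
nested[1].append(4)

print(list_changed, alist)
print(tuple_changed, atuple)
print(nested)
```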
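2. In Python, dictionary keys must be hashable. A tuple is immutable and therefore hashable (as long as all of its elements are), so it can serve as a dictionary key, while a list cannot: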

>>> adict = {}
>>> atuple = (1,2,3)
>>> adict[atuple] = 'ok'
>>> adict
{(1, 2, 3): 'ok'}
>>> alist = [1,2,3]
>>> adict[alist] = 'not ok'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
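One detail worth a quick sketch: a tuple is only hashable when everything inside it is hashable too. The helper `is_hashable` and the `points` dictionary below are names invented for this example.

```python
def is_hashable(obj):
    """Return True if obj can be hashed, and so used as a dict key."""
    try:
        hash(obj)
        return True
    except TypeError:
        return False


# Tuples work well as composite keys, for example (x, y) coordinates.
points = {(0, 0): 'origin', (1, 2): 'somewhere'}
print(points[(1, 2)])

print(is_hashable((1, 2, 3)))     # a tuple of ints
print(is_hashable([1, 2, 3]))     # a list
print(is_hashable((1, [2, 3])))   # a tuple that contains a list
```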
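3. As for use cases, tuple suits read-only data; for example, the rows you get back when Python queries MySQL are tuples.
list is for data whose length is not fixed or that needs to change. In addition, tuple performs somewhat better than list,
and tuple uses less memory than list: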


>>> a = list(xrange(100000))
>>> a.__sizeof__()
900088
>>> b = tuple(xrange(100000))
>>> b.__sizeof__()
800024
It is also faster. For how to measure this, see Stack Overflow.
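For anyone who wants to measure it locally, here is a rough sketch in Python 3 using the standard `sys.getsizeof` and `timeit`. It is only one way to compare the two, not the method from the Stack Overflow answer. The absolute numbers depend on the interpreter version and machine, so none are quoted here. `sys.getsizeof` also includes garbage-collector overhead, so it will not match `__sizeof__()` exactly.

```python
import sys
import timeit

n = 100000
as_list = list(range(n))
as_tuple = tuple(range(n))

# Memory: compare the container sizes (the int objects themselves are shared).
list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)
print('list bytes: ', list_size)
print('tuple bytes:', tuple_size)

# Speed: time how long it takes to build a small literal many times.
list_time = timeit.timeit('[1, 2, 3, 4, 5]', number=200000)
tuple_time = timeit.timeit('(1, 2, 3, 4, 5)', number=200000)
print('list literal:  %.4f s' % list_time)
print('tuple literal: %.4f s' % tuple_time)
```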

Comcast's Network and Me


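  A while back on a high-speed train, I was flipping through the onboard magazine and found an article griping about the American network giant Comcast: how it rips people off and how poor its service is. As a country bumpkin who has never used Comcast, I can't judge how good or bad its service really is.
  Still, Comcast reminded me of my days doing odd jobs at a small company, where we had to run traceroute measurements against a great many IPs. It felt like endless traceroute. We then analyzed the results and drew some conclusions. I remember they always said that geolocation worked well for Comcast. I did not expect that, after leaving that place, I would still run into Comcast, this time as the target of complaints.
  Ha, the network really does make the world small: I rarely even leave my home province, yet I turn out to have some connection to an ISP across the ocean.
  One last traceroute for fun; who knows when I'll get to play with traceroute again: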
➜  Desktop  traceroute www.comcast.com
traceroute to www-prd.g.comcast.com (96.114.156.20), 64 hops max, 52 byte packets
 1  10.10.0.1 (10.10.0.1)  87.486 ms  84.712 ms  88.840 ms
 2  106.187.33.3 (106.187.33.3)  94.097 ms  90.211 ms  85.432 ms
 3  124.215.199.125 (124.215.199.125)  83.513 ms  89.550 ms  91.969 ms
 4  * * otejbb206.int-gw.kddi.ne.jp (124.215.194.177)  110.009 ms
 5  pajbb002.int-gw.kddi.ne.jp (203.181.100.202)  222.177 ms
    pajbb002.int-gw.kddi.ne.jp (203.181.100.206)  214.200 ms
    pajbb001.int-gw.kddi.ne.jp (203.181.100.134)  205.113 ms
 6  ix-pa4.int-gw.kddi.ne.jp (111.87.3.42)  221.573 ms
    ix-pa4.int-gw.kddi.ne.jp (111.87.3.70)  225.931 ms  221.207 ms
 7  124.215.192.126 (124.215.192.126)  226.368 ms  216.431 ms  209.909 ms
 8  pos-3-15-0-0-cr01.ashburn.va.ibone.comcast.net (68.86.86.25)  263.942 ms
    pos-3-2-0-0-cr01.56marietta.ga.ibone.comcast.net (68.86.86.165)  227.324 ms
    pos-3-15-0-0-cr01.ashburn.va.ibone.comcast.net (68.86.86.25)  235.313 ms
 9  be-10919-cr01.1601milehigh.co.ibone.comcast.net (68.86.85.154)  265.006 ms  261.971 ms  239.354 ms
10  he-0-13-0-0-ar01.area4.il.chicago.comcast.net (68.86.94.126)  256.685 ms  253.493 ms  247.654 ms
11  te-0-1-0-0-ar02-d.potomac.co.ndcwest.comcast.net (68.86.206.2)  258.036 ms  254.292 ms  269.330 ms
12  162.151.27.218 (162.151.27.218)  241.044 ms  252.085 ms  259.350 ms
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  162.151.27.218 (162.151.27.218)  443.467 ms !X * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *
31  * 162.151.27.218 (162.151.27.218)  216.655 ms !X *
32  * * *
33  * * *
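Judging by the routers, the packets went from Japan to the United States (124.215.192.126 might even be a router on the sea floor), then reached Chicago (for some reason Chicago always makes me think of food, like cheeseburgers...) and Colorado, and finally vanished who knows where.
With MaxMind's IP database, you can look up where the routers the packets passed through are located: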
from subprocess import Popen, PIPE
import re
import pygeoip


def get_traceroute_res(hostname, max_hop=10):
    # Run traceroute with at most max_hop hops and return its stdout as text.
    p = Popen(['traceroute', '-m', str(max_hop), hostname], stdout=PIPE, stderr=PIPE)
    output, err = p.communicate()
    return output.decode('utf-8', 'replace')


def get_all_ip(result):
    # Match dotted-quad IPv4 addresses that are not part of a longer number.
    reip = re.compile(r'(?<![.\d])(?:\d{1,3}\.){3}\d{1,3}(?![.\d])')
    return reip.findall(result)


if __name__ == '__main__':
    result = get_traceroute_res('comcast.com')
    ip_list = get_all_ip(result)
    gi = pygeoip.GeoIP("GeoLiteCity.dat")

    for ip in ip_list:
        ip_info = gi.record_by_addr(ip)
        # Private or unknown addresses have no record in the database; skip them.
        if not ip_info:
            continue
        print(ip_info['country_name'], ip_info['latitude'], ip_info['longitude'], ip)
The results:
Japan 35.69 139.69 106.187.33.3
Japan 35.69 139.69 106.187.33.3
Japan 35.69 139.69 124.215.199.125
Japan 35.69 139.69 124.215.199.125
Japan 35.69 139.69 124.215.194.178
Japan 35.69 139.69 124.215.194.162
Japan 35.69 139.69 124.215.194.177
Japan 35.69 139.69 203.181.100.66
Japan 35.69 139.69 203.181.100.202
Japan 35.69 139.69 111.87.3.46
Japan 35.69 139.69 111.87.3.54
Japan 35.69 139.69 124.215.192.126
Japan 35.69 139.69 124.215.192.126
United States 38.0 -97.0 68.86.85.154
United States 38.0 -97.0 68.86.94.126
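Some addresses show up twice in the results because traceroute prints a hop without a reverse DNS name as `ip (ip)`, and the regular expression matches both copies. A small sketch of removing the duplicates while keeping hop order follows; `unique_ips` is a name made up here, and the sample lines are copied from the traceroute output earlier in this post.

```python
import re

# Same pattern as in the script: a dotted quad that is not part of a longer number.
IP_RE = re.compile(r'(?<![.\d])(?:\d{1,3}\.){3}\d{1,3}(?![.\d])')


def unique_ips(text):
    """Return the IPv4 addresses found in text, first occurrence only, in order."""
    seen = set()
    ordered = []
    for ip in IP_RE.findall(text):
        if ip not in seen:
            seen.add(ip)
            ordered.append(ip)
    return ordered


sample = """\
 2  106.187.33.3 (106.187.33.3)  94.097 ms  90.211 ms  85.432 ms
 3  124.215.199.125 (124.215.199.125)  83.513 ms  89.550 ms  91.969 ms
 4  * * otejbb206.int-gw.kddi.ne.jp (124.215.194.177)  110.009 ms
"""

print(unique_ips(sample))
```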

Friday, April 10, 2015

netcat Usage Notes

Netcat Notes

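netcat is a tool for debugging network data. It is very handy for debugging network programs, can also do many powerful things, and matters a great deal for network security. Here are notes on some common uses:

1. Use as a remote login shell:

On the server, run the following command to listen for connections on port 12345. Here -e /bin/bash makes bash commands executable once a client connects: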

gcc:~ zookeep$ netcat -l -p 12345 -e /bin/bash

In another terminal, connect to the service and run a few commands:

gcc:Desktop zookeep$ netcat 10.223.138.163 12345
ls | grep test
test
perl -e 'print "hello"';
hello
python -c "print 123"
123
uname -a
Darwin gcc.local 13.3.0 Darwin Kernel Version 13.3.0: Tue Jun  3 21:27:35 PDT 2014; root:xnu-2422.110.17~1/RELEASE_X86_64 x86_64

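2. The -v option prints the messages received; the -n option accepts only dotted-decimal IP addresses and skips DNS resolution:

For example, with google.com: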

gcc:~ zookeep$ ping -c 4 google.com
PING google.com (74.125.200.113): 56 data bytes
64 bytes from 74.125.200.113: icmp_seq=0 ttl=48 time=117.342 ms
64 bytes from 74.125.200.113: icmp_seq=1 ttl=48 time=122.505 ms
64 bytes from 74.125.200.113: icmp_seq=2 ttl=48 time=118.082 ms
64 bytes from 74.125.200.113: icmp_seq=3 ttl=48 time=134.491 ms

--- google.com ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 117.342/123.105/134.491/6.864 ms

gcc:~ zookeep$ netcat -v -n 74.125.200.113 80
74.125.200.113 80 (http) open

If you pass a domain name together with -n, it cannot be resolved:

gcc:~ zookeep$ netcat -v -n google.com  80
Error: Couldn't resolve host "google.com"

With -v and a domain name, resolution happens automatically:

gcc:~ zookeep$ netcat -v google.com 443
google.com [74.125.200.113] 443 (https) open
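The "numeric addresses only" behavior of -n has a close analogue in Python's standard `socket` module: the `AI_NUMERICHOST` flag tells `getaddrinfo` to skip name resolution. The sketch below is only an analogy, not how netcat itself is implemented, and `is_numeric_address` is a name invented for the example.

```python
import socket


def is_numeric_address(host):
    """Return True if host is a literal IP address; no DNS lookup is attempted."""
    try:
        socket.getaddrinfo(host, None, flags=socket.AI_NUMERICHOST)
        return True
    except socket.gaierror:
        # With AI_NUMERICHOST a hostname is rejected instead of being resolved.
        return False


print(is_numeric_address('74.125.200.113'))
print(is_numeric_address('google.com'))
```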

3. Use -lp to listen and redirect the peer's input:

gcc:~ zookeep$ netcat -v localhost 12345
localhost [127.0.0.1] 12345 (italk) open
hello world
test words from client
^CExiting.

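After the client disconnects, the text just typed shows up in the server's current directory, in the file the server redirected its output to. This can serve as a simple chat tool on a LAN.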
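A minimal Python sketch of the same listen-and-save idea, using only the standard `socket` module. The function names `listen_socket` and `save_one_connection` and the file name in the usage note are made up for illustration; this is a sketch, not a replacement for netcat.

```python
import socket


def listen_socket(port, host='127.0.0.1'):
    """Create a listening TCP socket, roughly like `netcat -l -p PORT`."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def save_one_connection(server, path):
    """Accept a single client and write everything it sends to path."""
    conn, _peer = server.accept()
    total = 0
    with conn, open(path, 'wb') as out:
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # empty read: the client closed the connection
                break
            out.write(chunk)
            total += len(chunk)
    return total
```

To try it, call `save_one_connection(listen_socket(12345), 'received.txt')` in one terminal and connect from another with `netcat -v localhost 12345`; the call returns once the client exits.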

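4. Use as a lightweight port scanner (nmap is the better tool). After opening port 12345, connect to it to see the source and destination ports and whether the connection succeeded: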

gcc:~ zookeep$ nc -v localhost 12345
nc: connectx to localhost port 12345 (tcp) failed: Connection refused
found 0 associations
found 1 connections:
     1: flags=82
  outif lo0
  src 127.0.0.1 port 64609
  dst 127.0.0.1 port 12345
  rank info not available
  TCP aux info available

Connection to localhost port 12345 [tcp/italk] succeeded!
lll

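5. A slightly more complex scan: -r scans the port list in random order, -w3 sets a 3-second timeout, and -z uses zero-I/O mode (connect only, send no data):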

gcc:~ zookeep$ nc -v -n -r -w3 -z 127.0.0.1 10-15
nc: connectx to 127.0.0.1 port 15 (tcp) failed: Connection refused
nc: connectx to 127.0.0.1 port 11 (tcp) failed: Connection refused
nc: connectx to 127.0.0.1 port 10 (tcp) failed: Connection refused
nc: connectx to 127.0.0.1 port 13 (tcp) failed: Connection refused
nc: connectx to 127.0.0.1 port 12 (tcp) failed: Connection refused
nc: connectx to 127.0.0.1 port 14 (tcp) failed: Connection refused
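The three options map fairly directly onto a few lines of Python. The sketch below assumes a plain TCP connect scan: it shuffles the ports (like -r), sets a per-connection timeout (like -w3), and closes each socket right after connecting without sending data (like -z). `scan_ports` is a name made up here.

```python
import random
import socket


def scan_ports(host, ports, timeout=3.0, shuffle=True):
    """Try a TCP connect to each port; return {port: True if open else False}."""
    ports = list(ports)
    if shuffle:
        random.shuffle(ports)  # visit the ports in random order
    results = {}
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            # connect_ex returns 0 on success and an errno value otherwise.
            results[port] = (sock.connect_ex((host, port)) == 0)
        finally:
            sock.close()  # nothing is sent or read
    return results


for port, is_open in sorted(scan_ports('127.0.0.1', range(10, 16)).items()):
    print(port, 'open' if is_open else 'closed')
```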

6. Get information from a connection, for example the version (5.6.16) after connecting to MySQL:

gcc:Desktop zookeep$ netcat localhost 3306
J
5.6.16`QT3@'C\��'K$q.0DEnx34mysql_native_password

A simple Python implementation of this:

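import socket
import re


def get_mysql_version(ip_address, port=3306):
    # MySQL sends a handshake packet as soon as the TCP connection opens;
    # the server version string is embedded in it.
    try:
        sock = socket.create_connection((ip_address, port), timeout=3)
    except OSError as error:
        print(error)
        return None
    try:
        data = sock.recv(2048)
    finally:
        sock.close()
    # Look for a dotted version number such as 5.6.16 in the raw bytes.
    match = re.search(rb'\d+\.\d+\.\d+', data)
    return match.group(0).decode('ascii') if match else None


if __name__ == '__main__':
    print('MySQL version:', get_mysql_version('localhost'))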
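Instead of searching the bytes with a regular expression, the start of the greeting can also be read by position. In the MySQL handshake packet, a 3-byte little-endian payload length and a 1-byte sequence number come first, followed by a 1-byte protocol version and a NUL-terminated server version string. The sketch below parses just that prefix. `parse_handshake` is a made-up name, and the sample packet is hand-built and truncated for the example, not captured from a real server.

```python
def parse_handshake(packet):
    """Return (protocol_version, server_version) from the start of a handshake packet."""
    payload_len = int.from_bytes(packet[0:3], 'little')  # 3-byte length header
    payload = packet[4:4 + payload_len]                  # skip the sequence byte
    protocol_version = payload[0]
    end = payload.index(b'\x00', 1)                      # version string is NUL-terminated
    server_version = payload[1:end].decode('ascii')
    return protocol_version, server_version


# Hand-built sample: protocol 10, version "5.6.16", then four filler bytes.
payload = b'\x0a' + b'5.6.16' + b'\x00' + b'\x01\x00\x00\x00'
packet = len(payload).to_bytes(3, 'little') + b'\x00' + payload

print(parse_handshake(packet))
```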

7. Make an HTTP request:

gcc:~ zookeep$ netcat -v google.com 80
google.com [74.125.200.113] 80 (http) open
GET / HTTP/1.1

HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.com.sg/?gfe_rd=cr&ei=-J4nVa73EIWCuASZ_4HgAQ
Content-Length: 262
Date: Fri, 10 Apr 2015 09:59:20 GMT
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0.5


302 Moved

302 Moved

The document has moved here.
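The same exchange can be sketched in Python with a raw socket. HTTP/1.1 requires a Host header, so the sketch sends one, and it adds `Connection: close` so the server ends the response by closing the connection. The function names are invented for this example. `http_get` needs network access and is only defined here, not called.

```python
import socket


def build_get_request(host, path='/'):
    """Build a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        'GET {} HTTP/1.1'.format(path),
        'Host: {}'.format(host),
        'Connection: close',
        '',
        '',
    ]
    return '\r\n'.join(lines).encode('ascii')


def parse_status_line(response):
    """Split the first response line into (version, status_code, reason)."""
    status_line = response.split(b'\r\n', 1)[0].decode('ascii')
    version, code, reason = status_line.split(' ', 2)
    return version, int(code), reason


def http_get(host, port=80, path='/', timeout=5):
    """Send the request and read until the server closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path))
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b''.join(chunks)


print(build_get_request('google.com'))
print(parse_status_line(b'HTTP/1.1 302 Found\r\nCache-Control: private\r\n\r\n'))
```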

8. Scan the open ports on the local machine:

gcc:Desktop zookeep$ netcat -v -z localhost 1-65535
localhost [127.0.0.1] 21 (ftp) open
localhost [127.0.0.1] 80 (http) open
localhost [127.0.0.1] 443 (https) open
localhost [127.0.0.1] 631 (ipp) open
localhost [127.0.0.1] 3306 (mysql) open
localhost [127.0.0.1] 8021 (intu-ec-client) open
localhost [127.0.0.1] 25035 open
localhost [127.0.0.1] 27017 open

Thursday, April 2, 2015

You had me at hello.

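        I don't remember where I first saw this sentence (You had me at hello.). At first I didn't know what it meant; after searching Baidu I learned it is close to the Chinese "一见钟情" (love at first sight). Still, I find the original more vivid: two strangers, perhaps just introduced by someone else, smile and say "hello" (though by convention it should be Nice to meet you, haha), and then their hearts surge. Love at first sight, on the other hand, seems to come in many forms. What I can think of is mostly a guy facing a pretty girl and instantly feeling fucking high, just as described in You're Beautiful.

        I have had my own love at first sight. Luckily, I did not throw myself into the sea after the fucking high because I would never see her again. Why love at first sight and not You had me at hello? Because it was over before my hello was even spoken. So 'You had me at hello.' and "love at first sight" do differ. The scene I imagine goes like this: after two people have grown fond of each other, one day they talk about when the feeling began, and one whispers in the other's ear: You had me at hello. Love at first sight, judging by my own experience, may lack that cheek-to-cheek sweetness. The extreme case is what James Blunt portrays.

       Perhaps some words reveal their differences only once you picture them, or show their proper use only once you have lived them.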

   
print "You had me at hello"