WinterHouse's Blog

technology

Introduction to Mautic

Jul 13, 2016

mautic

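Mautic is an open-source marketing automation platform, and as far as I know it is currently the only open-source one. I briefly mentioned marketing automation platforms earlier while studying HubSpot; the link is here.

Put simply, Mautic lets you monitor your site's traffic, record every visitor's browsing actions, the email address they log in with, and their personal information, and manage all of it on one unified platform. For example, you can send a personalized email to every user who registered on your site but has not logged in for more than three days. My understanding is limited, so I cannot go very deep, but in short Mautic appears to be the only open-source inbound marketing framework available right now.

Mautic is built on Symfony and is very easy to deploy. I tried deploying it on Heroku, and I have to say Heroku makes deployment remarkably convenient: you can point it directly at a GitHub repo, it deploys the code automatically, and it redeploys whenever the code is updated. Cloud services in China do not offer anything this convenient.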

Introduction to Redis Sentinel and a small bug

Dec 3, 2015

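Introduction to Redis Sentinel

Redis Sentinel is a special kind of Redis server, but it is still a Redis server: during startup it swaps the code it runs from the regular Redis code to the Sentinel code.

Sentinel's main job in a master-replica setup is to elect a new master from the replicas after the master Redis goes down, so that the distributed cache keeps working.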

Mapreduce Part2

Nov 14, 2015

Map Reduce join

The course starts with an example of a join. First, the contents of the files:

testfile1

A strange connection pool

Nov 12, 2015

object pool object pool pattern

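The project needs Redis, and with it something like a Redis connection pool. I found one on GitHub (https://github.com/luca3m/redis3m); it follows the usual connection pool design.

The pool keeps all connections in a set member variable. Calling the getconnection member function looks in the pool; if a connection is available, it is returned and removed from the set. When the client is done, it uses the put member function to put the connection back into the pool. The only thing I felt needed changing was that the pool has no limit on the maximum number of connections. I have sent a pull request for that, which is also the first request of mine ever accepted by someone I do not know...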
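The design described above, plus the missing cap on connections, can be sketched in Python. This is only an illustration of the idea and not redis3m's code (which is C++); the names `BoundedPool`, `PoolExhausted`, `factory`, and `max_connections` are made up for this sketch.

```python
import threading


class PoolExhausted(Exception):
    """Raised when every connection is checked out and the cap is reached."""


class BoundedPool:
    def __init__(self, factory, max_connections):
        self._factory = factory          # callable that opens a new connection
        self._max = max_connections      # cap on connections created by this pool
        self._idle = set()               # connections currently available
        self._created = 0                # idle plus checked-out connections
        self._lock = threading.Lock()

    def get_connection(self):
        with self._lock:
            if self._idle:
                return self._idle.pop()  # reuse an idle connection
            if self._created >= self._max:
                raise PoolExhausted("all connections are in use")
            conn = self._factory()       # open a new one while under the cap
            self._created += 1
            return conn

    def put(self, conn):
        # Hand a connection back so a later get_connection() can reuse it
        with self._lock:
            self._idle.add(conn)
```

Raising when the pool is exhausted is one possible policy; blocking until a connection is returned would be another.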

But when I read the code of my team's database connection pool, I found that it does not work the way a normal connection pool does.

Mapreduce part 1

Nov 9, 2015

MapReduce

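Let's look at MapReduce through the most basic Hadoop example from a Coursera course: counting how many times each word occurs.

We put two files on HDFS, testfile1 and testfile2:

testfile1: A long time ago in a galaxy far far away
testfile2: Another episode of Star Wars

MapReduce defines two abstract programming interfaces, Map and Reduce, which we implement ourselves: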

map: (source data) → [(k1; v1)]

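Input to map: the raw data

Output of map: the key-value pairs produced by processing the raw file

In the word-count example, the input to map is text and the output is <word,1>.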


#!/usr/bin/env python   
#the above just indicates to use python to interpret this file

# ---------------------------------------------------------------
#This mapper code will input a line of text and output <word, 1>
# 
# ---------------------------------------------------------------

import sys             #a python module with system functions for this OS

# ------------------------------------------------------------
#  this 'for loop' will set 'line' to an input line from system 
#    standard input file
# ------------------------------------------------------------
for line in sys.stdin:  

#-----------------------------------
# iterating over sys.stdin yields one line of standard input at a time;
# note that 'line' is a string object, ie variable, and it has methods that you can apply to it,
# as in the next line
# ---------------------------------
    line = line.strip()  #strip is a method, ie function, associated
                         #  with string variable, it will strip 
                         #   leading and trailing whitespace,
                         #   including the newline (by default)
    keys = line.split()  #split line at whitespace (by default), 
                         #   and return a list of keys
    for key in keys:     #a for loop through the list of keys
        value = 1        
        print('{0}\t{1}'.format(key, value) ) #the {} is replaced by 0th,1st items in format list
                            #also, note that the Hadoop default is 'tab' separates key from the value

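reduce: (k1; [v1]) → [(k2; v2)]

Input to reduce: the key-value pairs [(k1; v1)] emitted by map are merged so that the different values under the same key are collected into one list [v1]; the input to reduce is therefore (k1; [v1]).

Processing: the incoming list of intermediate results is organized or further processed in some way, producing the final output [(k2; v2)].

Output of reduce: the final result [(k2; v2)]

In the word-count example, the output of reduce is <word,count>.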



#!/usr/bin/env python

# ---------------------------------------------------------------
#This reducer code will input a line of text and 
#    output <word, total-count>
# ---------------------------------------------------------------
import sys

last_key      = None              #initialize these variables
running_total = 0

# -----------------------------------
# Loop thru file
#  --------------------------------
for input_line in sys.stdin:
    input_line = input_line.strip()

    # --------------------------------
    # Get Next Word
    # --------------------------------
    this_key, value = input_line.split("\t", 1)  #the Hadoop default is tab separates key value
                          #the split command returns a list of strings, in this case into 2 variables
    value = int(value)           #int() will convert a string to integer (this program does no error checking)
 
    # ---------------------------------
    # Key Check part
    #    if this current key is same 
    #          as the last one Consolidate
    #    otherwise  Emit
    # ---------------------------------
    if last_key == this_key:     #check if key has changed ('==' is
                                 #      logical equality check)
        running_total += value   # add value to running total

    else:
        if last_key:             #if this key that was just read in
                                 #   is different, and the previous 
                                 #   (ie last) key is not empty,
                                 #   then output 
                                 #   the previous <key running-count>
            print( "{0}\t{1}".format(last_key, running_total) )
                                 # hadoop expects tab(ie '\t') 
                                 #    separation
        running_total = value    #reset values
        last_key = this_key

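# Emit the final key; testing 'is not None' also covers empty input,
# where this_key would never have been assigned
if last_key is not None:
    print( "{0}\t{1}".format(last_key, running_total))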

Output when reducetasks is 0


-rw-r--r--   1 cloudera supergroup          0 2015-11-14 01:57 /user/cloudera/output_word_0/_SUCCESS
-rw-r--r--   1 cloudera supergroup         61 2015-11-14 01:57 /user/cloudera/output_word_0/part-00000
-rw-r--r--   1 cloudera supergroup         39 2015-11-14 01:57 /user/cloudera/output_word_0/part-00001


A	1
long	1
time	1
ago	1
in	1
a	1
galaxy	1
far	1
far	1
away	1


Another	1
episode	1
of	1
Star	1
Wars	1

Output when reducetasks is 1


-rw-r--r--   1 cloudera supergroup          0 2015-11-14 02:05 /user/cloudera/output_word_1/_SUCCESS
-rw-r--r--   1 cloudera supergroup         94 2015-11-14 02:05 /user/cloudera/output_word_1/part-00000


A	1
Another	1
Star	1
Wars	1
a	1
ago	1
away	1
episode	1
far	2
galaxy	1
in	1
long	1
of	1
time	1

Output when reducetasks is 2


-rw-r--r--   1 cloudera supergroup          0 2015-11-14 02:14 /user/cloudera/output_word_2/_SUCCESS
-rw-r--r--   1 cloudera supergroup         64 2015-11-14 02:14 /user/cloudera/output_word_2/part-00000
-rw-r--r--   1 cloudera supergroup         30 2015-11-14 02:14 /user/cloudera/output_word_2/part-00001


A	1
Another	1
Wars	1
a	1
ago	1
episode	1
far	2
in	1
of	1
time	1


Star	1
away	1
galaxy	1
long	1

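We can see that with no reduce step, the output is simply the map output. With one reduce task, every key-value pair is sent to that single reducer. With two reduce tasks, the pairs are grouped by key and then split across the two reduce tasks.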
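The three cases can be imitated locally with a small Python simulation. This is a rough sketch and not how Hadoop is implemented: `run_job`, `map_words`, and `partition` are hypothetical names, each input line stands in for one input file, and `zlib.crc32` is used only as a stable stand-in hash. Hadoop's default partitioner hashes keys differently, so the exact split between two reducers will not match the listing above.

```python
import zlib
from collections import defaultdict


def map_words(line):
    # Same job as the mapper: emit (word, 1) for every word
    return [(word, 1) for word in line.split()]


def partition(key, num_reducers):
    # Stand-in hash; the same key always lands on the same reducer
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_job(lines, num_reducers):
    if num_reducers == 0:
        # Map-only job: one output part per input, no grouping at all
        return [map_words(line) for line in lines]
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:
        for key, value in map_words(line):
            buckets[partition(key, num_reducers)][key].append(value)
    # Each reducer sees its keys in sorted order and sums their values
    return [[(k, sum(vs)) for k, vs in sorted(b.items())] for b in buckets]


lines = ["A long time ago in a galaxy far far away",
         "Another episode of Star Wars"]
for n in (0, 1, 2):
    print(n, run_job(lines, n))
```

Whatever hash is used, a given key such as `far` goes to exactly one reducer, which is why its two occurrences are summed into a single count.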

Qt miscellany

Oct 21, 2015

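Qt strings

Converting between QString and std::string is an easy place to run into problems (see "Convert QString to string"). QString uses UTF-16 internally, whereas a std::string may hold bytes in many different encodings. std::string itself has no notion of encoding; it is the source of its bytes that determines one. The QString functions we use most often for this include toUtf8, fromUtf8, toLocal8Bit, fromLocal8Bit, toStdString, and fromStdString.

First comes the question of encodings (see "coding"). Unicode is not a specific encoding but a standard, and what people call "Unicode" can map to different encodings in different places, most often UTF-16. The distinction people usually draw between "UTF-8" and "Unicode" is really the rivalry between UTF-16 and UTF-8 (see "utf-16&utf-8").

Your computer may have ASCII, UTF-8, the local 8-bit encoding (your locale's own encoding: also built from 8-bit units, but not one of the unified Unicode encodings), UTF-16, and others. The best way to keep a QString working correctly is to make sure that everything going in and coming out uses the matching to/from function. In Qt 4, toStdString and fromStdString use the ASCII conversion (Qt 5 changed them to UTF-8). If you do not know which encoding your string's bytes came from, though, there is no way to guarantee that your QString is correct.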
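The effect of pairing the wrong "from" with a given "to" can be shown without Qt. The sketch below uses Python's codecs as a stand-in for the QString conversions; choosing `gbk` as the "local 8-bit" encoding is an assumption for illustration (it is a common locale encoding on Chinese Windows systems).

```python
text = "编码"  # two CJK characters

utf8_bytes = text.encode("utf-8")       # 3 bytes per character for this text
utf16_bytes = text.encode("utf-16-le")  # 2 bytes per character for this text
local_bytes = text.encode("gbk")        # assumed local 8-bit encoding


def decode(data, codec):
    # Undecodable bytes become U+FFFD instead of raising an error
    return data.decode(codec, errors="replace")


# Matching encode/decode pairs round-trip cleanly
assert decode(utf8_bytes, "utf-8") == text
assert decode(local_bytes, "gbk") == text

# Mismatched pair: local 8-bit bytes interpreted as UTF-8 are garbled
garbled = decode(local_bytes, "utf-8")
print(len(utf8_bytes), len(utf16_bytes), len(local_bytes), garbled != text)
```

The same bytes are only meaningful together with the encoding that produced them, which is the point of always using the matching conversion.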

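On Qt internationalization: QString exists to meet Qt's cross-platform, cross-locale needs, which is why it uses a UTF encoding. Whether the bytes come from a file or anywhere else, handle them with Qt's own facilities where possible (Qt's file streams, Qt's HTTP classes), so that your bytes are not silently altered along the way.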

SimpleHTTPTest

Oct 10, 2015

Sending HTTP requests on a schedule


import sched
import time
import http.client  # Python 3 name for the httplib module used originally
# from tester.models import Tasks  # Django model; restore along with the Tasks lines below


class planrequest:
    """Send one HTTP request every planinterval seconds, plannum + 1 times in total."""

    def __init__(self, _plannum, _planinterval, _httpaddress, _httpbody, request):
        self.planner = sched.scheduler(time.time, time.sleep)
        self.plannum = _plannum
        self.planinterval = _planinterval
        self.httpaddress = _httpaddress
        self.httpbody = _httpbody
        self.httpmethod = "POST"
        self.httpport = 8081
        self.httppath = "/"        # assumption: the original used httppath without defining it
        self.taskresult = dict()   # HTTP status code -> number of responses seen
        self.taskid = self.getclientip(request) + str(time.time())
        # newtask = Tasks(task_text=self.taskid, task_body=self.httpbody, task_type="P")
        # newtask.save()

    def dowork(self, worker):
        print("Doing work...")
        # your header
        headers = {"Content-type": "application/x-www-form-urlencoded", "Accept": "text/plain"}
        conn = http.client.HTTPConnection(self.httpaddress, self.httpport)
        conn.request(self.httpmethod, self.httppath, self.httpbody, headers)
        res = conn.getresponse()
        conn.close()

        # do something with result: count responses per status code
        self.taskresult[res.status] = self.taskresult.get(res.status, 0) + 1

        # check if all done.
        if self.plannum > 0:
            self.plannum -= 1
            worker.enter(float(self.planinterval), 1, self.dowork, (worker,))
        else:
            # all done.
            # doingtask = Tasks.objects.get(task_text=self.taskid)
            # doingtask.task_result = self.taskresult
            # doingtask.save()
            pass

    def start(self):
        self.planner.enter(float(self.planinterval), 1, self.dowork, (self.planner,))
        print(time.time())
        self.planner.run()

    def getclientip(self, request):
        # Prefer the first address in X-Forwarded-For, then REMOTE_ADDR, else ""
        try:
            real_ip = request.META['HTTP_X_FORWARDED_FOR']
            reqip = real_ip.split(",")[0]
        except Exception:
            try:
                reqip = request.META['REMOTE_ADDR']
            except Exception:
                reqip = ""
        return reqip

A simple example for graphviz

Aug 8, 2015

digraph asde91 {
    ranksep=.75; size = "7.5,7.5"; bgcolor = "cornsilk3";
    {
        node [shape=plaintext, fontsize=16];
        "Empire" -> "Kingdom" -> "Phylum" -> "Class" -> "Order" -> "Family" -> "Genus" -> "Species";
    }
    node [shape=box,style=filled,color = "dodgerblue4"];
    { rank = same; "Life" }
    node [shape=box,style=filled,color = "dodgerblue3"];
    { rank = same; "Empire"; "Prokaryota"; "Eukaryote"; }
    node [shape=box,style=filled,color = "dodgerblue2"];
    { rank = same; "Kingdom"; "Bacteria"; "Protozoa"; "Chromista"; "Plantae"; "Fungi"; "Animalia" }
    node [shape=box,style=filled,color = "dodgerblue1"];
    { rank = same; "Phylum"; "vertebrates"; "molluscs"; "arthropods"; "annelids"; "sponges"; "jellyfish" }
    node [shape=box,style=filled,color = "dodgerblue"];
    { rank = same; "Class"; "Agnatha"; "Chondrichthyes"; "Osteichthyes"; "Amphibia"; "Reptilia"; "Aves"; "Mammalia" }
    node [shape=box,style=filled,color = "deepskyblue2"];
    { rank = same; "Order"; "Artiodactyla"; "Carnivora"; "Cetacea"; "Chiroptera"; "Insectivora"; "Lagomorpha"; "Macroscelidea"; "Perissodactyla"; "Pholidota"; "Rodentia"; "Sirenia"; "Tubulidentata"; "Edentata"; "Hyracoidea"; "Condylarthra"; "Creodonta"; "Desmostylia"; "Embrithopoda"; "Primates" }
    node [shape=box,style=filled,color = "deepskyblue1"];
    { rank = same; "Family"; "Hylobatidae"; "Hominidae"; "Callitrichidae"; "Cebidae"; "Aotidae"; "Pitheciidae"; "Atelidae"; "Cheirogaleidae"; "Daubentoniidae"; "Lemuridae"; "Lepilemuridae"; "Indriidae" }
    node [shape=box,style=filled,color = "deepskyblue"];
    { rank = same; "Genus"; "Pongo"; "Gorilla"; "Pan"; "Homo" }
    node [shape=box,style=filled,color = "skyblue"];
    { rank = same; "Species"; "erectus"; "habilis"; "ergaster" }
    "Life" -> "Prokaryota";
    "Life" -> "Eukaryote";
    "Prokaryota" -> "Bacteria";
    "Eukaryote" -> { "Protozoa"; "Chromista"; "Plantae"; "Fungi"; "Animalia" };
    "Animalia" -> { "vertebrates"; "molluscs"; "arthropods"; "annelids"; "sponges"; "jellyfish" };
    "vertebrates" -> { "Agnatha"; "Chondrichthyes"; "Osteichthyes"; "Amphibia"; "Reptilia"; "Aves"; "Mammalia" };
    "Mammalia" -> { "Insectivora"; "Lagomorpha"; "Artiodactyla"; "Carnivora"; "Cetacea"; "Chiroptera" };
    "Mammalia" -> { "Macroscelidea"; "Perissodactyla"; "Pholidota"; "Rodentia"; "Sirenia"; "Tubulidentata" };
    "Mammalia" -> { "Edentata"; "Hyracoidea"; "Condylarthra"; "Creodonta"; "Desmostylia"; "Embrithopoda"; "Primates" };
    "Primates" -> { "Hylobatidae"; "Hominidae"; "Callitrichidae"; "Cebidae"; "Aotidae"; "Pitheciidae"; "Atelidae"; "Cheirogaleidae"; "Daubentoniidae"; "Lemuridae"; "Lepilemuridae"; "Indriidae" };
    "Hominidae" -> { "Pongo"; "Gorilla"; "Pan"; "Homo" };
    "Homo" -> { "erectus"; "habilis"; "ergaster" };
}

HttpHeaders&HttpClient

May 10, 2015

I have never been very clear on HTTP headers, so here are two blog posts I recommend.

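4-15: Reversing word order

Apr 15, 2015

Problem: reversing word order

Reverse the order of the words while keeping the order of the letters within each word unchanged.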
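One straightforward way to do this in Python is to split on single spaces, reverse the list of words, and join it again. This is a sketch of one possible solution, not necessarily the one the full post uses; `reverse_words` is a name chosen here.

```python
def reverse_words(sentence):
    # Splitting on a single space keeps runs of spaces intact as empty strings
    words = sentence.split(" ")
    return " ".join(reversed(words))


print(reverse_words("I am a student."))  # student. a am I
```

In a language with mutable strings, the usual in-place alternative is to reverse the whole string and then reverse each word.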

4-13: Common node of linked lists

Apr 13, 2015

Problem: common node of linked lists

Given two linked lists, find their first common node.
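A common approach, sketched here under the assumption of singly linked lists, is to measure both lengths, advance the longer list by the difference, and then walk both in step until the same node object is reached. `ListNode` and `first_common_node` are names made up for this example.

```python
class ListNode:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def length(head):
    count = 0
    while head is not None:
        count += 1
        head = head.next
    return count


def first_common_node(a, b):
    len_a, len_b = length(a), length(b)
    # Skip the extra nodes at the front of the longer list
    while len_a > len_b:
        a, len_a = a.next, len_a - 1
    while len_b > len_a:
        b, len_b = b.next, len_b - 1
    # Both lists now have equal length; compare nodes by identity
    while a is not b:
        a, b = a.next, b.next
    return a  # None when the lists share no node
```

Nodes are compared with `is`, since two lists share a node only when they reach the very same object, not merely equal values.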

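4-11: Ugly numbers

Apr 11, 2015

Problem: ugly numbers

We call numbers whose only prime factors are 2, 3, and 5 ugly numbers. Find the 1500th ugly number in ascending order.

Ugly numbers, for example: 1, 2, 3, 4, 5, 6, 8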
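A well-known way to generate ugly numbers in order is to build each new one as the smallest of 2, 3, or 5 times an earlier ugly number, keeping one index per factor. The sketch below assumes n is at least 1; `nth_ugly_number` is a name chosen for this example.

```python
def nth_ugly_number(n):
    ugly = [1]          # 1 counts as the first ugly number
    i2 = i3 = i5 = 0    # positions whose multiples have not been used yet
    while len(ugly) < n:
        candidate = min(ugly[i2] * 2, ugly[i3] * 3, ugly[i5] * 5)
        ugly.append(candidate)
        # Advance every index that produced the candidate to avoid duplicates
        if candidate == ugly[i2] * 2:
            i2 += 1
        if candidate == ugly[i3] * 3:
            i3 += 1
        if candidate == ugly[i5] * 5:
            i5 += 1
    return ugly[n - 1]


print([nth_ugly_number(i) for i in range(1, 8)])  # [1, 2, 3, 4, 5, 6, 8]
```

This takes O(n) time and space, instead of testing every integer for its prime factors.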

4-10: Mirror of a binary tree

Apr 10, 2015

Problem: mirror of a binary tree

Given a binary tree, produce its mirror image.
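Mirroring amounts to swapping the left and right children of every node, which a short recursion can do in place. `TreeNode`, `mirror`, and `preorder` are names introduced only for this sketch.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def mirror(root):
    # Swap the children of every node, modifying the tree in place
    if root is None:
        return None
    root.left, root.right = mirror(root.right), mirror(root.left)
    return root


def preorder(root):
    # Helper for inspecting a tree: root, then left subtree, then right subtree
    if root is None:
        return []
    return [root.value] + preorder(root.left) + preorder(root.right)
```

Applying `mirror` twice gives back the original tree.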

4-08: Checking for a subtree

Apr 8, 2015

Problem: checking for a subtree

Given two binary trees A and B, determine whether A is a substructure of B.
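One way to read "substructure" is that A's shape and values appear starting at some node of B, while B may have extra descendants below. Under that assumption, and treating an empty tree as not being a substructure of anything, a two-level recursion works. `TreeNode`, `matches`, and `is_substructure` are hypothetical names.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def matches(node, pattern):
    # True if pattern can be laid over the tree rooted at node
    if pattern is None:
        return True
    if node is None or node.value != pattern.value:
        return False
    return matches(node.left, pattern.left) and matches(node.right, pattern.right)


def is_substructure(a, b):
    # True if tree a appears as a substructure somewhere inside tree b
    if a is None or b is None:
        return False
    return (matches(b, a)
            or is_substructure(a, b.left)
            or is_substructure(a, b.right))
```

If an exact subtree is wanted instead, `matches` would also need to reject the case where `pattern` is None but `node` is not.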

4-07: Merging linked lists

Apr 7, 2015

Problem: merging linked lists

Merge two sorted linked lists.

input: 1->3->5; 2->4->6

output: 1->2->3->4->5->6
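The example above can be reproduced with an iterative merge that relinks the existing nodes behind a dummy head. This is a sketch with made-up names (`ListNode`, `merge_sorted`, `from_values`, `to_values`), not necessarily the post's own solution.

```python
class ListNode:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def merge_sorted(a, b):
    dummy = tail = ListNode(None)   # placeholder so the first link needs no special case
    while a is not None and b is not None:
        if a.value <= b.value:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a if a is not None else b   # append whatever remains
    return dummy.next


def from_values(values):
    head = None
    for value in reversed(values):
        head = ListNode(value, head)
    return head


def to_values(head):
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


print(to_values(merge_sorted(from_values([1, 3, 5]), from_values([2, 4, 6]))))
# [1, 2, 3, 4, 5, 6]
```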

4-03: Character replacement

Apr 3, 2015

Problem: character replacement

Implement a function that replaces every space in a string with "%20".

input: "we are happy"

output: "we%20are%20happy"
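In Python, where strings are immutable, the simplest version builds a new string character by character. `replace_spaces` is a name chosen for this sketch; the built-in `str.replace(" ", "%20")` gives the same result.

```python
def replace_spaces(text):
    # Emit "%20" for each space and every other character unchanged
    return "".join("%20" if ch == " " else ch for ch in text)


print(replace_spaces("we are happy"))  # we%20are%20happy
```

With a fixed-size character buffer, as in C, the usual approach is to count the spaces first and then fill the buffer from the back so nothing is overwritten.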

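Alibaba R&D C++ written test

Apr 2, 2015

Multiple-choice questions

My one takeaway from the multiple-choice section: being bad at math is a lifelong regret = =. Roughly a third of the questions were math (probability, permutations and combinations, and the like), a third were C++ fundamentals, and a third were data structures and algorithms such as red-black trees and binary trees.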

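3-31: Reconstructing a binary tree

Mar 31, 2015

Problem: reconstructing a binary tree

(1) Reconstruct a binary tree from its preorder and inorder traversals. (2) Reconstruct a binary tree from its postorder and inorder traversals.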
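Both variants rely on the same observation: the root is the first element of the preorder sequence (or the last of the postorder sequence), and its position in the inorder sequence separates the left subtree from the right. The sketch below assumes all node values are distinct; every name in it is introduced for illustration.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def build_from_preorder(preorder, inorder):
    if not preorder:
        return None
    root = TreeNode(preorder[0])        # the root comes first in preorder
    split = inorder.index(root.value)   # number of nodes in the left subtree
    root.left = build_from_preorder(preorder[1:split + 1], inorder[:split])
    root.right = build_from_preorder(preorder[split + 1:], inorder[split + 1:])
    return root


def build_from_postorder(postorder, inorder):
    if not postorder:
        return None
    root = TreeNode(postorder[-1])      # the root comes last in postorder
    split = inorder.index(root.value)
    root.left = build_from_postorder(postorder[:split], inorder[:split])
    root.right = build_from_postorder(postorder[split:-1], inorder[split + 1:])
    return root


def preorder_of(root):
    if root is None:
        return []
    return [root.value] + preorder_of(root.left) + preorder_of(root.right)


def postorder_of(root):
    if root is None:
        return []
    return postorder_of(root.left) + postorder_of(root.right) + [root.value]
```

List slicing keeps the code short at the cost of extra copying; passing index ranges instead would avoid that.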

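3-30: Reordering an array so odd numbers come before even numbers

Mar 30, 2015

Problem: reordering an array so odd numbers come before even numbers

(1) Given an integer array, implement a function that reorders it so that all odd numbers are in the front part of the array and all even numbers are in the back part. (2) Consider designing a general pattern that solves this whole class of problems.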
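One way to cover both parts is a two-pointer partition that takes the "belongs in front" rule as a function, so the odd/even case becomes one instance of the pattern. This is a sketch with made-up names (`partition_in_place`, `odds_before_evens`), and it does not preserve the relative order of the elements.

```python
def partition_in_place(items, belongs_in_front):
    left, right = 0, len(items) - 1
    while left < right:
        # Find the first element from the left that belongs in the back
        while left < right and belongs_in_front(items[left]):
            left += 1
        # Find the first element from the right that belongs in the front
        while left < right and not belongs_in_front(items[right]):
            right -= 1
        if left < right:
            items[left], items[right] = items[right], items[left]
    return items


def odds_before_evens(numbers):
    # In Python, n % 2 == 1 also holds for negative odd numbers
    return partition_in_place(numbers, lambda n: n % 2 == 1)
```

Passing a different predicate, for example `lambda n: n < 0`, reuses the same routine to put negatives before non-negatives.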


This blog is maintained by AIRRAYA

Get in touch with us at airrayagroup@gmail.com