Author: Matt

  • Programmer vs Entrepreneur

    I went to school to learn how to program.  In fact I went to one of the best Computer Science schools in Canada – University of Waterloo.  While there I took all of the hardest courses I could fit into my schedule, including 2 of the “big three” – Graphics and Real-time Operating Systems – which led to many (MANY) nights in the lab when I didn’t get home until 5am.

    I love writing software and I love reading about it.  In fact I estimate I do about 3 to 4 hours of reading every day!

    Over the last few years I have tempered my reading about computer-related topics with learning entrepreneurship skills: things like sales, marketing, accounting, financing, leadership, and networking.  This evolution has changed me.  It opened me up to a lot of new ideas and points of view.

    There are a lot of entrepreneur skills that make a lot of sense for software development.

    Great salespeople split test and measure the results of every approach they take.  They change headlines, colors, fonts, layouts, images, etc. to gather data about what works best.  They never stop testing new things.  Gathering these statistics over time improves the sales process and can make drastically more money for the business.

    I have not seen many (if any) applications that take this type of approach to optimization.  Imagine how much better software could be if there were easy ways to gather and visualize how its usage changes between versions as a result of UI changes, or how real-world performance changes with a new algorithm or data structure.

    An entrepreneur’s biggest problem is having lots of ideas but finding it hard to finish any of them. It’s tough to get through the resistance and actually ship something before another idea comes up and provides an escape from the fear of failure.  The solutions to this natural tendency are to not self-censor, to be stupidly ignorant and stick to the core idea, and to anticipate the resistance by pushing through with more focus as the deadline looms.

    These are things that I think programmers should embrace.  The first version of a program should be written as quickly as possible, with little thought put into performance or maintenance.  Anticipate that as you get to the later stages of writing a program the parts remaining will get harder and harder to get through (because you’ve skipped the hard stuff along the way).  When you get to the last few bits of a project the only thing remaining is usually the stuff you hate doing – UI code or graphics or tedious clean up.  Expect it.  Focus.  Do the work.

    Good entrepreneurs know that their opinion on their own ideas is fairly useless. Not until the product is available on the market, where customers can prove its value by opening their wallets, is anything certain. Until the idea is proven in the marketplace its value is a guess.

    Likewise, until a program is in the hands of users, programmers are just guessing at how it will be used, by whom, and with what level of knowledge. Therefore just get it out there and measure these sorts of things after the fact. Don’t make too many guesses about a product before letting people use it. And recognize that until people use it you are just guessing.

    Most of the entrepreneurs I have met exude a tolerance for taking risks – for going after the big ideas. Have that same boldness for your own software development work and you’ll accomplish great things.

  • Schrödinger’s Programmer

    Schrödinger’s Programmer is a thought experiment, a real-life paradox which comes as a result of the Copenhagen interpretation of quantum mechanics. The thought experiment presents a programmer that may or may not have written software.

    You have a closed office. In this office is a computer (with internet access) and a software programmer. She is tasked with writing a piece of software that can be written in an hour. However there is an equal chance that she will instead find and read something interesting on reddit.com and accomplish nothing in an hour.

    After an hour has elapsed one would say that the software is finished if meanwhile she did work. The psi-function of the entire system would express this by having in it both the project-completed and nothing-done states mixed or smeared out in equal parts.

    It is typical of these cases that an indeterminacy originally restricted to whether or not something of interest is on reddit.com becomes transformed into macroscopic indeterminacy, which can then be resolved by direct observation by opening the office door. That prevents us from so naively accepting as valid a “blurred model” for representing reality. In itself, it would not embody anything unclear or contradictory. There is a difference between a shaky or out-of-focus photograph and a snapshot of clouds and fog banks.

  • Entrepreneurship Skills and Phases

    Within the last week I have nailed down the final set of features and bug fixes for Automatic Blog Machine. It’s now in a stage where the development is finished and I can start focusing on the sales and marketing push and finally release it out to people like you who may be interested in using it.

    There are a few different phases to go through in the creation of a business like this. Through my past failed experiences at entrepreneurship I have found that there are a number of steps that require different skills, which can create problems for a one-man show like mine.

    To be successful you need (at a minimum):

    1. idea generation
    2. development and testing of the idea into an actual product or service
    3. financial and legal organization
    4. sales and marketing ideas and execution
    5. customer support

    The vast majority of people don’t have interest in doing all of these wildly different tasks.  They all require different skill sets and different motivations.  All of these steps can be broken down further, revealing more mundane day-to-day details which can bog down even the most determined entrepreneur.

    The job of the entrepreneur is to learn which of these pieces they have the desire and motivation to see through themselves and which are better handled by paying someone else to do it for them.  Unfortunately sometimes the way to learn these things is to try and fail.

    I have on several occasions tried to get a business idea off the ground.  As the idea generator and the software developer I usually find myself stalling at the software testing phase.  It’s so boring.  Going through a product, writing test code and manually checking interfaces takes time and effort, which is often the downfall of the entire enterprise.

    Your mileage may vary, but for this latest project I was able to get through the testing phase with the extra external motivation I received by attending a conference and meeting people that were actually interested in buying the final product.  That gave me the confidence to get through it.

    For the financial and business organization stuff I decided to hire an accountant.  This may have been more beneficial than I expected.  Having someone else deal with the paperwork kept me from having to dig through reams of material to figure out the best business structure to use, how to allocate shares, and how to file and register everything with the government.  Beyond that, there were a number of side benefits to having someone else do the work:

    1. Makes it feel more legitimate – I have registered businesses myself, but when you do that it somehow feels less real.  You just hand in a few forms, pay the fees and declare yourself President.  Having an accountant witness you declare yourself President somehow feels less like a scam.
    2. Provides external motivation – I told the accountant what I expect to be making and when I expect to launch.  I know he’ll see the numbers when tax time rolls around so there’s pressure to actually see it through and start generating some revenue.  Going back to the accountant with $0 in revenue to declare would be a huge embarrassment.
    3. If I had to figure out how to do everything it would have taken months of dragging my feet figuring out what forms go to whom and in what order.  The accountant got through everything in about 3 days.
    4. I now have an expert on my team to ask questions about how best to organize things.  He can quickly tell me how to organize my finances, and when is a good time to register holding companies, trusts, and how to issue additional shares.

    Dealing now with the mental switch going into sales and marketing is tough.  The technical aspects of software development suit my personality, but switching to the creative aspects of creating a marketing plan, recording videos, and being persuasive requires a completely different set of skills.

    Not many of the projects I have started made it to this stage.  Most die off much earlier.  But the few things for which I have successfully managed to create a marketing plan and make the product available for sale have taught me a few lessons.

    Getting something for sale is THE major turning point.  I have met quite a few people trying to make money online and very few of them actually get to the point where they have something for sale.  The few that do have something for sale are in a much better position – they can test different marketing strategies, split test offers, find and partner with others and run customer surveys.  Once you have something for sale there’s lots you can do to grow the business.

    For any business you simply can’t make money if you have nothing to sell.  Selling other people’s things can be done profitably, but in my experience it’s difficult to compete against the army of people trying to do the same thing.

    Finally, customer support will land on you whether or not you are prepared for it once you have sold something.  There are two methods for customer support.  The classical approach is to have people answer customer questions and concerns either through a call center or some other support center.  The Google approach is to move most of the support to the community through the use of wikis and forums – customers can help themselves and help other customers directly.

    These five categories of skills are the major distinct skills required in a small online business.  I’m sure that as I continue to learn and hit new stages of business growth new lessons will be learned and skills acquired.

  • Ben Franklin’s Daily Schedule

    Came across this today.

    The things that struck me are:

    1. Ben and I go to sleep at the same time, but he got up 2 hours earlier in the morning than I do.
    2. Two hour lunch breaks — good idea.
    3. Time to plan and time to reflect each day.

    I’m going to try to work this schedule for a few days and see how it works.

  • Mark V Shaney

    Mark V Shaney was a fake Usenet user whose postings were automatically generated using Markov chains.  http://en.wikipedia.org/wiki/Mark_V_Shaney

    I read about this a few weeks ago and thought that it would be interesting to try and create a Mark V Shaney twitter account that would be trained using the twitter fire hose.  The result was the twitter account http://twitter.com/Mark_V_Shaney.

    The algorithm behind creating the tweets is to list all the triplets of words that appear in the training text.  Then, for any two words, it selects a third by looking at every instance where those two words appeared next to each other and choosing one of the words that followed, which gives a statistical bias towards more frequent triplets.
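
    The triplet idea can be sketched in a few lines on a toy sentence.  This is only a minimal illustration in Python 3, separate from the bot’s own code; the helper names build_chain and next_word and the sample text are made up for the example.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each pair of adjacent words to the list of words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        # Duplicates are kept on purpose: a follower that was seen twice
        # is twice as likely to be picked by random.choice().
        chain[(w1, w2)].append(w3)
    return dict(chain)


def next_word(chain, w1, w2, rng=random):
    """Pick a follower for the pair (w1, w2), or None if the pair has none."""
    followers = chain.get((w1, w2))
    return rng.choice(followers) if followers else None


if __name__ == "__main__":
    sample = "the cat sat on the mat and the cat sat on the sofa and the cat ran"
    chain = build_chain(sample)
    print(chain[("the", "cat")])           # followers recorded for "the cat"
    print(next_word(chain, "the", "cat"))  # biased towards the commoner one
```

    Keeping repeated followers in a plain list is the simplest way to get the frequency bias; a pair that only ever appeared once has a single follower, so the output simply replays the source text from that point.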

    It has created such gems as:

    [embedded tweet, id 47790828004970496]

    [embedded tweet, id 39541254765150209]

    This was the first time I have tried to connect to the Twitter firehose, and after a few tests I realized that the spritzer was more my speed for this application.  It took a few days to create a 10MB database of filtered tweets.  Filtering proved very important after I found that taking in all tweets doesn’t provide enough structure to produce anything relevant – there are too many different languages, people who talk in various slangs, and people’s names.  Once a strange word shows up in the Markov selection process it can result in copying the rest of the tweet word for word, because a rare pair of words has only one recorded follower.

    So here’s some Python code that generates the tweets.  It makes use of the sqlalchemy, python-twitter and tweepy libraries.  (Tweepy is the only one that I could find that would connect to the streaming API.)

    #!/usr/bin/env python
    
    import StringIO
    import random
    import time
    import sys
    from textwrap import TextWrapper
    from optparse import OptionParser
    
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker
    from sqlalchemy import Table, Column, Integer, String, MetaData, Date, DateTime, Float
    from sqlalchemy.schema import UniqueConstraint
    from sqlalchemy.ext.declarative import declarative_base
    
    import tweepy
    import twitter
    
    CONNSTRING='sqlite:///MarkVShaney.sqlite'
    
    TRAINING_SEARCH_KEYWORDS = ['senate', 'government', 'federal', 'usda', 'ftc', 'usaid', 'nasa', 'noaa', 'usajobs', 'congress']
    
    #for connecting to hose
    TWITTER_USER = ""
    TWITTER_PW = ""
    
    #for tweeting
    TWITTER_CONSUMER_KEY = ''
    TWITTER_CONSUMER_SECRET = ''
    TWITTER_ACCESS_TOKEN_KEY = ''
    TWITTER_ACCESS_TOKEN_SECRET = ''
    
    
    #command line options
    parser = OptionParser()
    parser.add_option('-l', '--listen', action='store_true', dest='listen', default=False, help='listen for tweets and build database')
    (options, args) = parser.parse_args()
    
    
    Base = declarative_base()
    class StreamWatcherListener(tweepy.StreamListener):
    
        status_wrapper = TextWrapper(width=60, initial_indent='    ', subsequent_indent='    ')
        engine = create_engine(CONNSTRING, echo=False)
    
        def __init__(self):
            self.metadata = Base.metadata
            self.metadata.create_all(self.engine)
            super(StreamWatcherListener, self).__init__()
    
        def on_status(self, status):
            try:
                if status.author.lang == 'en' and len(status.text.strip().split(' ')) > 15:
                    #print self.status_wrapper.fill(status.text)
                    #print '\n %s  %s  via %s\n' % (status.author.screen_name, status.created_at, status.source)
                    Session = sessionmaker(bind=self.engine)
                    session = Session()
                    tweet = Tweet(status.created_at, status.source, status.text, status.author.screen_name)
                    session.add(tweet)
                    session.commit()
            except Exception as ex:
                # Catch any unicode errors while printing to console
                print ex.args[0]
                # and just ignore them to avoid breaking application.
                pass
    
        def on_error(self, status_code):
            print 'An error has occurred! Status code = %s' % status_code
            return True  # keep stream alive
    
        def on_timeout(self):
            print 'Snoozing Zzzzzz'
    
    class Tweet(Base):
        """
        This tells SQLAlchemy how to store tweets in the database.
        """
        __tablename__ = 'tweets'
    
        id = Column(Integer, primary_key=True)
        date = Column(Date)
        source = Column(String)
        text = Column(String)
        screen_name = Column(String)
    
        def __init__(self, date, source, text, screen_name):
            self.date = date
            self.text = text
            self.screen_name = screen_name
            self.source = source
    
    
        def __repr__(self):
            return "%s - %s" % (str(self.screen_name), str(self.text))
    
    
    class Markov(object):
        def __init__(self, open_file):
            self.cache = {}
            self.open_file = open_file
            self.words = self.file_to_words()
            self.word_size = len(self.words)
            self.database()
    
        def file_to_words(self):
            self.open_file.seek(0)
            data = self.open_file.read()
            words = data.split()
            return words

        def triples(self):
            """ Generates triples from the given data string. So if our string were
                    "What a lovely day", we'd generate (What, a, lovely) and then
                    (a, lovely, day).
            """
            if len(self.words) < 3:
                return
            for i in range(len(self.words) - 2):
                yield (self.words[i], self.words[i+1], self.words[i+2])

        def database(self):
            for w1, w2, w3 in self.triples():
                key = (w1, w2)
                if key in self.cache:
                    self.cache[key].append(w3)
                else:
                    self.cache[key] = [w3]

        def generate_markov_text(self, size=25):
            seed = random.randint(0, self.word_size-3)
            seed_word, next_word = self.words[seed], self.words[seed+1]
            w1, w2 = seed_word, next_word
            gen_words = []
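            for i in xrange(size):
                gen_words.append(w1)
                # The last word pair in the corpus may have no recorded
                # follower; stop early instead of raising a KeyError.
                if (w1, w2) not in self.cache:
                    break
                w1, w2 = w2, random.choice(self.cache[(w1, w2)])
            gen_words.append(w2)
            return ' '.join(gen_words)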
    
    def tweet(message):
        """
        posts tweet
        """
        api = twitter.Api(consumer_key=TWITTER_CONSUMER_KEY, consumer_secret=TWITTER_CONSUMER_SECRET, access_token_key=TWITTER_ACCESS_TOKEN_KEY, access_token_secret=TWITTER_ACCESS_TOKEN_SECRET)
        status = api.PostUpdate(message)
        print message
    
    def train_markov():
        """Get the tweets from DB and push into markov for training
        """
        engine = create_engine(CONNSTRING, echo=False)
    
        metadata = Base.metadata
        metadata.create_all(engine)
    
        Session = sessionmaker(bind=engine)
        session = Session()
        tweets = session.query(Tweet).all()
        text = StringIO.StringIO()
        for tweet in tweets:
            try:
                text.write(' %s ' % tweet.text)
            except:
                pass
        mark = Markov(text)
    
        return mark
    
    
    def main():
        auth = tweepy.auth.BasicAuthHandler(TWITTER_USER, TWITTER_PW)
        stream = tweepy.Stream(auth, StreamWatcherListener(), timeout=None)
        stream.filter(None, TRAINING_SEARCH_KEYWORDS)
    
    
    if __name__ == '__main__':
        if options.listen:
            print "connecting to twitter hose"
            try:
                main()
            except KeyboardInterrupt:
                print '\nGoodbye!'
        else:
            mark = train_markov()
            message = mark.generate_markov_text()
            while len(message) > 140 or message.find("RT") >-1 :
                message = mark.generate_markov_text()
    
            tweet(message)
    
    
  • Python Imap Gmail

    Connecting to a Google Gmail account is easy with Python using the built-in imaplib library. It’s possible to download, read, mark and delete messages in your Gmail account by scripting it.

    Here’s a very simple script that prints out the latest email received:

    #!/usr/bin/env python
    
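    import imaplib

    # Gmail serves IMAP over SSL on port 993.
    M = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    M.login('myemailaddress@gmail.com', 'password')
    # select() returns the number of messages in the mailbox.
    status, count = M.select('Inbox')
    # Messages are numbered oldest to newest, so the count is also the
    # sequence number of the most recent message.
    status, data = M.fetch(count[0], '(UID BODY[TEXT])')

    print data[0][1]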
    M.close()
    M.logout()
    

    As you can see, not a lot of code is required to log in and check an email. However, imaplib provides just a very thin layer over the IMAP protocol, and you’ll have to refer to the documentation on how IMAP works and the commands available to really use imaplib. As you can see in the fetch command, the “(UID BODY[TEXT])” bit is a raw IMAP instruction. In this case I’m calling fetch with the number of messages in the Inbox folder because the most recent email is listed last (so its sequence number equals the message count), and telling it to return the UID and body text of that email. There are many more complex ways to navigate an IMAP inbox. I recommend playing with it in the interpreter and connecting directly to the server with telnet to understand exactly what is happening.
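
    As one example of going a step further, here is a rough sketch that lists the subjects of unread messages without marking them as read.  It is written for Python 3 (unlike the script above), and the helper names parse_search_ids, subject_of and unread_subjects are made up for illustration; SEARCH UNSEEN and BODY.PEEK[HEADER] are standard IMAP commands.

```python
import email
import imaplib
from email.header import decode_header, make_header


def parse_search_ids(data):
    """Turn the payload of a SEARCH response into a list of message numbers.

    imaplib hands back something like [b'1 2 5'], or [b''] for no matches.
    """
    if not data or not data[0]:
        return []
    return data[0].split()


def subject_of(raw_message):
    """Extract and decode the Subject header from raw message bytes."""
    msg = email.message_from_bytes(raw_message)
    return str(make_header(decode_header(msg.get("Subject", ""))))


def unread_subjects(host, user, password, limit=5):
    """Return the subjects of the most recent unread messages in the inbox."""
    conn = imaplib.IMAP4_SSL(host, 993)
    try:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)  # read-only: flags stay untouched
        status, data = conn.search(None, "UNSEEN")
        subjects = []
        for num in parse_search_ids(data)[-limit:]:
            # PEEK fetches the headers without setting the \Seen flag.
            status, parts = conn.fetch(num, "(BODY.PEEK[HEADER])")
            subjects.append(subject_of(parts[0][1]))
        conn.close()
        return subjects
    finally:
        conn.logout()
```

    Calling unread_subjects('imap.gmail.com', 'myemailaddress@gmail.com', 'password') would run it against the same account as the script above.  The two small parsing helpers are kept separate so they can be tried in the interpreter without a server connection.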

    Here’s a good resource for quickly getting up to speed with IMAP: “Accessing IMAP email accounts using telnet”.

  • Bash Keyboard Shortcuts

    Bash is an incredibly powerful shell, and being proficient with it can make a massive difference in your productivity. Small tips and tricks can sometimes make a big difference in how you work. The shortcuts I’ve listed below deal mostly with what is actually readline functionality and so they may work in many other command line situations and programs. This is not a complete list but just some of my favorites.
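
    Because these commands belong to readline rather than to bash itself, other readline-based programs understand the same command names.  As a rough sketch (Python 3; the BINDINGS list and the apply_bindings helper are made up for illustration, and platforms that ship libedit instead of GNU readline use a different binding syntax), Python’s own readline module accepts inputrc-style lines:

```python
try:
    import readline  # wraps GNU readline (or libedit on some platforms)
except ImportError:  # some builds ship without a readline module
    readline = None

# inputrc-style lines; the command names are the ones listed in this post.
BINDINGS = [
    r'"\C-a": beginning-of-line',
    r'"\ef": forward-word',
    r'"\C-r": reverse-search-history',
]


def apply_bindings(bindings=BINDINGS):
    """Hand each line to readline; return how many lines were applied."""
    if readline is None:
        return 0
    for line in bindings:
        readline.parse_and_bind(line)
    return len(bindings)
```

    After calling apply_bindings(), an input() prompt in the same script picks up the bindings when run in a terminal.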

    Commands for Moving

    These are the basics. The real standouts here are the commands for moving around the line by word, which can save you plenty of time compared to navigating only with the arrow keys.

    • beginning-of-line (Ctrl-a)
      Move to the start of the current line.
    • end-of-line (Ctrl-e)
      Move to the end of the line.
    • forward-char (Ctrl-f)
      Move forward a character.
    • backward-char (Ctrl-b)
      Move back a character.
    • forward-word (Meta-f)
      Move forward to the end of the next word. Words are composed of alphanumeric characters (letters and digits).
    • backward-word (Meta-b)
      Move back to the start of the current or previous word. Words are composed of alphanumeric characters (letters and digits).
    • clear-screen (Ctrl-l)
      Clear the screen leaving the current line at the top of the screen. With an argument, refresh the current line without clearing the screen.

    Commands for Manipulating the History

    These can be lifesavers, especially if you’re running the same or similar commands over and over. For example, to replay a sequence of commands, Ctrl-o is much faster than pressing ‘up’ a bunch of times to find the first command, then pressing ‘up’ the same number of times to get to the next command in the sequence. Use Ctrl-o, or maybe even a keyboard macro.

    • accept-line (Newline, Return)
      Accept the line regardless of where the cursor is. If this line is non-empty, add it to the history list according to the state of the HISTCONTROL variable. If the line is a modified history line, then restore the
      history line to its original state.
    • previous-history (Ctrl-p)
      Fetch the previous command from the history list, moving back in the list.
    • next-history (Ctrl-n)
      Fetch the next command from the history list, moving forward in the list.
    • beginning-of-history (Meta-<)
      Move to the first line in the history.
    • end-of-history (Meta->)
      Move to the end of the input history, i.e., the line currently being entered.
    • reverse-search-history (Ctrl-r)
      Search backward starting at the current line and moving `up’ through the history as necessary. This is an incremental search.
    • forward-search-history (Ctrl-s)
      Search forward starting at the current line and moving `down’ through the history as necessary. This is an incremental search.
    • yank-nth-arg (Meta-Ctrl-y)
      Insert the first argument to the previous command (usually the second word on the previous line) at point. With an argument n, insert the nth word from the previous command (the words in the previous command begin
      with word 0). A negative argument inserts the nth word from the end of the previous command. Once the argument n is computed, the argument is extracted as if the “!n” history expansion had been specified.
    • yank-last-arg (Meta-., Meta-_)
      Insert the last argument to the previous command (the last word of the previous history entry). With an argument, behave exactly like yank-nth-arg. Successive calls to yank-last-arg move back through the history
      list, inserting the last argument of each line in turn. The history expansion facilities are used to extract the last argument, as if the “!$” history expansion had been specified.
    • shell-expand-line (Meta-Ctrl-e)
      Expand the line as the shell does. This performs alias and history expansion as well as all of the shell word expansions. See HISTORY EXPANSION in the bash man page for a description of history expansion.
    • history-expand-line (Meta-^)
      Perform history expansion on the current line. See HISTORY EXPANSION in the bash man page for a description of history expansion.
    • insert-last-argument (Meta-., Meta-_)
      A synonym for yank-last-arg.
    • operate-and-get-next (Ctrl-o)
      Accept the current line for execution and fetch the next line relative to the current line from the history for editing. Any argument is ignored.
    • edit-and-execute-command (Ctrl-x Ctrl-e)
      Invoke an editor on the current command line, and execute the result as shell commands. Bash attempts to invoke $VISUAL, $EDITOR, and emacs as the editor, in that order.

    Commands for Changing Text

    • delete-char (Ctrl-d)
      Delete the character at point. If point is at the beginning of the line, there are no characters in the line, and the last character typed was not bound to delete-char, then return EOF.
    • quoted-insert (Ctrl-q, Ctrl-v)
      Add the next character typed to the line verbatim. This is how to insert characters like Ctrl-q, for example.
    • tab-insert (Ctrl-v TAB)
      Insert a tab character.
    • transpose-chars (Ctrl-t)
      Drag the character before point forward over the character at point, moving point forward as well. If point is at the end of the line, then this transposes the two characters before point. Negative arguments have no
      effect.
    • transpose-words (Meta-t)
      Drag the word before point past the word after point, moving point over that word as well. If point is at the end of the line, this transposes the last two words on the line.
    • upcase-word (Meta-u)
      Uppercase the current (or following) word. With a negative argument, uppercase the previous word, but do not move point.
    • downcase-word (Meta-l)
      Lowercase the current (or following) word. With a negative argument, lowercase the previous word, but do not move point.
    • capitalize-word (Meta-c)
      Capitalize the current (or following) word. With a negative argument, capitalize the previous word, but do not move point.

    Killing and Yanking

    Killing and yanking can be a tremendous time saver over copy/paste with the mouse.

    • kill-line (Ctrl-k)
      Kill the text from point to the end of the line.
    • backward-kill-line (Ctrl-x Backspace)
      Kill backward to the beginning of the line.
    • unix-line-discard (Ctrl-u)
      Kill backward from point to the beginning of the line. The killed text is saved on the kill-ring.
    • kill-word (Meta-d)
      Kill from point to the end of the current word, or if between words, to the end of the next word. Word boundaries are the same as those used by forward-word.
    • backward-kill-word (Meta-Backspace)
      Kill the word behind point. Word boundaries are the same as those used by backward-word.
    • shell-kill-word (unbound by default in bash 4.x)
      Kill from point to the end of the current word, or if between words, to the end of the next word. Word boundaries are the same as those used by shell-forward-word.
    • shell-backward-kill-word (unbound by default in bash 4.x)
      Kill the word behind point. Word boundaries are the same as those used by shell-backward-word.
    • unix-word-rubout (Ctrl-w)
      Kill the word behind point, using white space as a word boundary. The killed text is saved on the kill-ring.
    • delete-horizontal-space (Meta-\)
      Delete all spaces and tabs around point.

    Completing

    There are some powerful completing shortcuts.

    • complete (TAB)
      Attempt to perform completion on the text before point. Bash attempts completion treating the text as a variable (if the text begins with $), username (if the text begins with ~), hostname (if the text begins with
      @), or command (including aliases and functions) in turn. If none of these produces a match, filename completion is attempted.
    • possible-completions (Meta-?)
      List the possible completions of the text before point.
    • insert-completions (Meta-*)
      Insert all completions of the text before point that would have been generated by possible-completions.
    • complete-filename (Meta-/)
      Attempt filename completion on the text before point.
    • possible-filename-completions (Ctrl-x /)
      List the possible completions of the text before point, treating it as a filename.
    • complete-username (Meta-~)
      Attempt completion on the text before point, treating it as a username.
    • possible-username-completions (Ctrl-x ~)
      List the possible completions of the text before point, treating it as a username.
    • complete-variable (Meta-$)
      Attempt completion on the text before point, treating it as a shell variable.
    • possible-variable-completions (Ctrl-x $)
      List the possible completions of the text before point, treating it as a shell variable.
    • complete-hostname (Meta-@)
      Attempt completion on the text before point, treating it as a hostname.
    • possible-hostname-completions (Ctrl-x @)
      List the possible completions of the text before point, treating it as a hostname.
    • complete-command (Meta-!)
      Attempt completion on the text before point, treating it as a command name. Command completion attempts to match the text against aliases, reserved words, shell functions, shell builtins, and finally executable filenames, in that order.
    • possible-command-completions (Ctrl-x !)
      List the possible completions of the text before point, treating it as a command name.
    • dynamic-complete-history (Meta-TAB)
      Attempt completion on the text before point, comparing the text against lines from the history list for possible completion matches.
    • complete-into-braces (Meta-{)
      Perform filename completion and insert the list of possible completions enclosed within braces so the list is available to the shell (see Brace Expansion in the bash man page).

    Keyboard Macros

    These can be useful if you’re running the same few commands over and over. For example, when I’m working in my IDE and then want to run some tests, I can quickly create a macro the first time I run my couple of commands to clean, build, and run the tests. When I want to run that sequence again it’s very quick, and doesn’t require hunting/searching through the history.

    • start-kbd-macro (Ctrl-x ()
      Begin saving the characters typed into the current keyboard macro.
    • end-kbd-macro (Ctrl-x ))
      Stop saving the characters typed into the current keyboard macro and store the definition.
    • call-last-kbd-macro (Ctrl-x e)
      Re-execute the last keyboard macro defined, by making the characters in the macro appear as if typed at the keyboard.

    Miscellaneous

    • prefix-meta (ESC)
      Metafy the next character typed. ESC f is equivalent to Meta-f.
    • undo (Ctrl-_, Ctrl-x Ctrl-u)
      Incremental undo, separately remembered for each line.
    • tilde-expand (Meta-&)
      Perform tilde expansion on the current word.
    • set-mark (Ctrl-@, Meta-Space)
      Set the mark to the point. If a numeric argument is supplied, the mark is set to that position.
    • exchange-point-and-mark (Ctrl-x Ctrl-x)
      Swap the point with the mark. The current cursor position is set to the saved position, and the old cursor position is saved as the mark.
    • character-search (Ctrl-])
      A character is read and point is moved to the next occurrence of that character. A negative count searches for previous occurrences.
    • character-search-backward (Meta-Ctrl-])
      A character is read and point is moved to the previous occurrence of that character. A negative count searches for subsequent occurrences.
    • insert-comment (Meta-#)
      Without a numeric argument, the value of the readline comment-begin variable is inserted at the beginning of the current line. If a numeric argument is supplied, this command acts as a toggle: if the characters at the beginning of the line do not match the value of comment-begin, the value is inserted, otherwise the characters in comment-begin are deleted from the beginning of the line. In either case, the line is accepted as if a newline had been typed. The default value of comment-begin causes this command to make the current line a shell comment. If a numeric argument causes the comment character to be removed, the line will be executed by the shell.
    • display-shell-version (Ctrl-x Ctrl-v)
      Display version information about the current instance of bash.
  • Python Web Crawler Script

    Here’s a simple web crawling script that starts from one URL and finds all the pages it links to, up to a pre-defined depth. Web crawling is of course the lowest-level tool used by Google to create its multi-billion dollar business. You may not be able to compete with Google’s search technology, but being able to crawl your own sites, or those of your competitors, can be very valuable.

    You could, for instance, routinely check your websites to make sure they are live and all the links are working, and have the script notify you of any 404 errors. By adding in a page rank check you could identify better linking strategies to boost your page rank scores. And you could identify possible leaks – paths a user could take that lead them away from where you want them to go.
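
    As a rough sketch of the link-checking idea, here is one way it might look. It is written for Python 3 (the crawler itself is Python 2), and the function name find_broken_links and its opener parameter are made up for this illustration; opener exists only so the check can be tried without touching the network.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def find_broken_links(urls, opener=urlopen):
    """Return a list of (url, problem) pairs for links that fail to load.

    `opener` defaults to urllib's urlopen; it is a hypothetical hook so the
    function can be exercised with a stand-in instead of real requests.
    """
    broken = []
    for url in urls:
        try:
            response = opener(url)
            response.close()
        except HTTPError as err:
            # The server answered, but with an error status such as 404.
            broken.append((url, err.code))
        except URLError as err:
            # No usable answer at all (DNS failure, refused connection, ...).
            broken.append((url, str(err.reason)))
    return broken
```

    Feeding it the links the crawler collects would give you a list you could mail to yourself or write to a log.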

    Here’s the script:

    # -*- coding: utf-8 -*-
    from HTMLParser import HTMLParser
    from urllib2 import urlopen
    
    class Spider(HTMLParser):
        def __init__(self, starting_url, depth, max_span):
            HTMLParser.__init__(self)
            self.url = starting_url
            self.db = {self.url: 1}
            self.node = [self.url]
    
            self.depth = depth # recursion depth max
            self.max_span = max_span # max links obtained per url
            self.links_found = 0
    
        def handle_starttag(self, tag, attrs):
            # attrs is a list of (name, value) pairs; href is not always first
            link = dict(attrs).get('href')
            if self.links_found < self.max_span and tag == 'a' and link:
                if link[:4] != "http":
                    link = '/'.join(self.url.split('/')[:3])+('/'+link).replace('//','/')
    
                if link not in self.db:
                    print "new link ---> %s" % link
                    self.links_found += 1
                    self.node.append(link)
                self.db[link] = (self.db.get(link) or 0) + 1
    
        def crawl(self):
            for depth in xrange(self.depth):
                print "*"*70+("\nScanning depth %d web\n" % (depth+1))+"*"*70
                context_node = self.node[:]
                self.node = []
                for self.url in context_node:
                    self.links_found = 0
                    try:
                        req = urlopen(self.url)
                        res = req.read()
                        self.feed(res)
                    except Exception:
                        # skip pages that fail to download or parse
                        self.reset()
            print "*"*40 + "\nRESULTS\n" + "*"*40
            zorted = [(v,k) for (k,v) in self.db.items()]
            zorted.sort(reverse = True)
            return zorted
    
    if __name__ == "__main__":
        spidey = Spider(starting_url = 'http://www.7cerebros.com.ar', depth = 5, max_span = 10)
        result = spidey.crawl()
        for (n,link) in result:
            print "%s was found %d time%s." %(link,n, "s" if n != 1 else "")
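
    The script above builds absolute URLs by hand, always relative to the site root. A possible alternative, sketched here for Python 3 and not part of the original script, is to let urljoin resolve each href against the page it was found on. The names LinkExtractor and extract_links are invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if href:
            # urljoin resolves relative paths against the page's own URL
            self.links.append(urljoin(self.base_url, href))


def extract_links(base_url, html):
    """Parse one page of HTML and return its links as absolute URLs."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

    Anchors without an href (such as named anchors) are skipped, and links that are already absolute pass through urljoin unchanged.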
    
  • Amazon Product Advertising API From Python

    Amazon has a very comprehensive associate program that allows you to promote just about anything imaginable for any niche and earn a commission on anything you refer. The size of the catalog is what makes Amazon such a great program. People make some good money promoting Amazon products.

    There is a great Python library called boto for accessing the other Amazon web services such as S3 and EC2. However, it doesn’t support the Product Advertising API.

    With the Product Advertising API you have access to everything that you can read on the Amazon site about each product. This includes the product description, images, editor reviews, customer reviews and ratings. This is a lot of great information that you could easily find a good use for with your websites.

    So how do you get at this information from within a Python program? Well the complicated part is dealing with the authentication that Amazon has put in place. To make that a bit easier I used the connection component from boto.
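
    To give a feel for what that authentication involves, here is a small Python 3 sketch of request signing as I understand Amazon's Signature Version 2 scheme: sort the parameters, percent-encode them, build a string from the verb, host, path and query, and sign it with HMAC-SHA256. The function name sign_v2 is invented for this illustration, and this is not the code boto runs internally.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_v2(params, secret_key, host, path, verb='GET'):
    """Return (query_string, signature) for a Signature Version 2 style request.

    The caller is assumed to have already put SignatureVersion, Timestamp
    and the other required fields into `params`.
    """
    # Sort by parameter name and percent-encode names and values; only
    # letters, digits, '-', '_', '.' and '~' are left unescaped.
    query_string = '&'.join(
        '%s=%s' % (quote(str(k), safe='-_.~'), quote(str(v), safe='-_.~'))
        for k, v in sorted(params.items()))
    # The string to sign is verb, lowercase host, path and query, one per line.
    string_to_sign = '\n'.join([verb, host.lower(), path, query_string])
    digest = hmac.new(secret_key.encode('utf-8'),
                      string_to_sign.encode('utf-8'),
                      hashlib.sha256).digest()
    return query_string, base64.b64encode(digest).decode('ascii')
```

    The returned signature still has to be URL-quoted and appended to the query as the Signature parameter before the request is sent.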

    Here’s a demonstration snippet of code that will print out the top 10 best selling books on Amazon right now.

    Example Usage:

    $ python AmazonSample.py
    Glenn Becks Common Sense: The Case Against an Out-of-Control Government, Inspired by Thomas Paine by Glenn Beck
    Culture of Corruption: Obama and His Team of Tax Cheats, Crooks, and Cronies by Michelle Malkin
    The Angel Experiment (Maximum Ride, Book 1) by James Patterson
    The Time Travelers Wife by Audrey Niffenegger
    The Help by Kathryn Stockett
    South of Broad by Pat Conroy
    Paranoia by Joseph Finder
    The Girl Who Played with Fire by Stieg Larsson
    The Shack [With Headphones] (Playaway Adult Nonfiction) by William P. Young
    The Girl with the Dragon Tattoo by Stieg Larsson
    

    To use this code you’ll need an Amazon associate account, and you’ll need to fill in the keys and tag needed for authentication.

    Product Advertising API Python code:

    #!/usr/bin/env python
    # encoding: utf-8
    """
    AmazonExample.py
    
    Created by Matt Warren on 2009-08-17.
    Copyright (c) 2009 HalOtis.com. All rights reserved.
    """
    
    import time
    import urllib
    try:
        from xml.etree import ElementTree as ET
    except ImportError:
        from elementtree import ElementTree as ET
        
    from boto.connection import AWSQueryConnection
    
    AWS_ACCESS_KEY_ID = 'YOUR ACCESS KEY'
    AWS_ASSOCIATE_TAG = 'YOUR TAG'
    AWS_SECRET_ACCESS_KEY = 'YOUR SECRET KEY'
    
    def amazon_top_for_category(browseNodeId):
        aws_conn = AWSQueryConnection(
            aws_access_key_id=AWS_ACCESS_KEY_ID,
            aws_secret_access_key=AWS_SECRET_ACCESS_KEY, is_secure=False,
            host='ecs.amazonaws.com')
        aws_conn.SignatureVersion = '2'
        params = dict(
            Service='AWSECommerceService',
            Version='2009-07-01',
            SignatureVersion=aws_conn.SignatureVersion,
            AWSAccessKeyId=AWS_ACCESS_KEY_ID,
            AssociateTag=AWS_ASSOCIATE_TAG,
            Operation='ItemSearch',
            BrowseNode=browseNodeId,
            SearchIndex='Books',
            ResponseGroup='ItemAttributes,EditorialReview',
            Order='salesrank',
            Timestamp=time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime()))
        verb = 'GET'
        path = '/onca/xml'
        qs, signature = aws_conn.get_signature(params, verb, path)
        qs = path + '?' + qs + '&Signature=' + urllib.quote(signature)
        response = aws_conn._mexe(verb, qs, None, headers={})
        tree = ET.fromstring(response.read())
        
        NS = tree.tag.split('}')[0][1:]
    
        for item in tree.find('{%s}Items'%NS).findall('{%s}Item'%NS):
            title = item.find('{%s}ItemAttributes'%NS).find('{%s}Title'%NS).text
            author = item.find('{%s}ItemAttributes'%NS).find('{%s}Author'%NS).text
            print title, 'by', author
    
    if __name__ == '__main__':
        amazon_top_for_category(1000) #Amazon category number for US Books
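
    One thing to watch for in the parsing loop above: find() returns None when an item has no Author element, so the .text lookup fails. A possible way around that, sketched for Python 3 with a made-up response fragment (the namespace URL and the book data are invented for the example), is to use findtext with a default value.

```python
from xml.etree import ElementTree as ET

# Made-up response fragment for illustration; a real reply has many more fields.
SAMPLE = """<ItemSearchResponse xmlns="http://example.com/hypothetical-namespace">
  <Items>
    <Item><ItemAttributes><Title>First Book</Title><Author>A. Writer</Author></ItemAttributes></Item>
    <Item><ItemAttributes><Title>No Author Listed</Title></ItemAttributes></Item>
  </Items>
</ItemSearchResponse>"""


def titles_and_authors(xml_text):
    """Return a list of (title, author) pairs, tolerating missing authors."""
    tree = ET.fromstring(xml_text)
    # The root tag looks like '{namespace}ItemSearchResponse'
    ns = {'aws': tree.tag.split('}')[0][1:]}
    results = []
    for item in tree.findall('aws:Items/aws:Item', ns):
        title = item.findtext('aws:ItemAttributes/aws:Title',
                              default='', namespaces=ns)
        author = item.findtext('aws:ItemAttributes/aws:Author',
                               default='unknown', namespaces=ns)
        results.append((title, author))
    return results
```

    Passing a prefix-to-namespace mapping also saves repeating the '{%s}' formatting on every lookup.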
    
  • Get Your ClickBank Transactions Into Sqlite With Python

    Clickbank is an amazing service that allows anyone to either create and sell information products as a publisher, or sell other people’s products for a commission as an advertiser. Clickbank handles the credit card transactions and refunds, while affiliates can earn as much as 90% of the price of the products as commission. It’s a pretty easy-to-use system, and I have used it both as a publisher and as an affiliate to make significant amounts of money online.

    The script I have today is a Python program that uses Clickbank’s REST API to download the latest transactions for your affiliate IDs and stuffs the data into a database.

    The reason for doing this is that it keeps the data in your control and allows you to more easily see all of the transactions for all your accounts in one place without having to go to clickbank.com and log in to your accounts constantly. I’m going to be including this data in my Business Intelligence Dashboard Application.

    One of the new things I did while writing this script was to make use of SQLAlchemy to abstract the database. This means that it should be trivial to convert it over to use MySQL – just change the connection string.

    Also you should note that to use this script you’ll need to get the “Clerk API Key” and the “Developer API Key” from your Clickbank account. To generate those keys go to the Account Settings tab from the account dashboard. If you have more than one affiliate ID then you’ll need one Clerk API Key per affiliate ID.

    This is the biggest script I have shared on this site yet. I hope someone finds it useful.

    Here’s the code:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # (C) 2009 HalOtis Marketing
    # written by Matt Warren
    # http://halotis.com/
    
    import csv
    import httplib
    import logging
    from datetime import datetime
    
    from sqlalchemy import Table, Column, Integer, String, MetaData, Date, DateTime, Float
    from sqlalchemy.schema import UniqueConstraint
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker
    
    LOG_FILENAME = 'ClickbankLoader.log'
    logging.basicConfig(filename=LOG_FILENAME,level=logging.DEBUG,filemode='w')
    
    #generate these keys in the Account Settings area of ClickBank when you log in.
    ACCOUNTS = [{'account':'YOUR_AFFILIATE_ID',  'API_key': 'YOUR_API_KEY' },]
    DEV_API_KEY = 'YOUR_DEV_KEY'
    
    CONNSTRING='sqlite:///clickbank_stats.sqlite'
    
    Base = declarative_base()
    class ClickBankList(Base):
        __tablename__ = 'clickbanklist'
        __table_args__ = (UniqueConstraint('date','receipt','item'),{})
    
        id                 = Column(Integer, primary_key=True)
        account            = Column(String)
        processedPayments  = Column(Integer)
        status             = Column(String)
        futurePayments     = Column(Integer)
        firstName          = Column(String)
        state              = Column(String)
        promo              = Column(String)
        country            = Column(String)
        receipt            = Column(String)
        pmtType            = Column(String)
        site               = Column(String)
        currency           = Column(String)
        item               = Column(String)
        amount             = Column(Float)
        txnType            = Column(String)
        affi               = Column(String)
        lastName           = Column(String)
        date               = Column(DateTime)
        rebillAmount       = Column(Float)
        nextPaymentDate    = Column(DateTime)
        email              = Column(String)
        
        format = '%Y-%m-%dT%H:%M:%S'
        
        def __init__(self, account, processedPayments, status, futurePayments, firstName, state, promo, country, receipt, pmtType, site, currency, item, amount , txnType, affi, lastName, date, rebillAmount, nextPaymentDate, email):
            self.account            = account
            if processedPayments != '':
                self.processedPayments  = processedPayments
            self.status             = status
            if futurePayments != '':
                self.futurePayments     = futurePayments
            self.firstName          = firstName
            self.state              = state
            self.promo              = promo
            self.country            = country
            self.receipt            = receipt
            self.pmtType            = pmtType
            self.site               = site
            self.currency           = currency
            self.item               = item
            if amount != '':
                self.amount             = amount
            self.txnType            = txnType
            self.affi               = affi
            self.lastName           = lastName
            self.date               = datetime.strptime(date[:19], self.format)
            if rebillAmount != '':
                self.rebillAmount       = rebillAmount
            if nextPaymentDate != '':
                self.nextPaymentDate    = datetime.strptime(nextPaymentDate[:19], self.format)
            self.email              = email
    
        def __repr__(self):
            return "<ClickBankList('%s', '%s', '%s', '%s')>" % (self.account, self.date, self.receipt, self.item)
    
    def get_clickbank_list(API_key, DEV_key):
        conn = httplib.HTTPSConnection('api.clickbank.com')
        conn.putrequest('GET', '/rest/1.0/orders/list')
        conn.putheader("Accept", 'text/csv')
        conn.putheader("Authorization", DEV_key+':'+API_key)
        conn.endheaders()
        response = conn.getresponse()
        
        if response.status != 200:
            logging.error('HTTP error %s %s' % (response.status, response.reason))
            raise Exception(response)
        
        csv_data = response.read()
        
        return csv_data
    
    def load_clickbanklist(csv_data, account, dbconnection=CONNSTRING, echo=False):
        engine = create_engine(dbconnection, echo=echo)
    
        metadata = Base.metadata
        metadata.create_all(engine) 
    
        Session = sessionmaker(bind=engine)
        session = Session()
    
        data = csv.DictReader(iter(csv_data.split('\n')))
    
        for d in data:
            item = ClickBankList(account, **d)
            #check for duplicates before inserting
            checkitem = session.query(ClickBankList).filter_by(date=item.date, receipt=item.receipt, item=item.item).all()
        
            if not checkitem:
                logging.info('inserting new transaction %s' % item)
                session.add(item)
    
        session.commit()
        
    if  __name__=='__main__':
        try:
            for account in ACCOUNTS:
                csv_data = get_clickbank_list(account['API_key'], DEV_API_KEY)
                load_clickbanklist(csv_data, account['account'])
        except:
            logging.exception('Crashed')
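
    The script checks for duplicates with a query before every insert. Another way to get the same effect, sketched here in Python 3 with the standard sqlite3 module instead of SQLAlchemy, is to let the unique constraint do the work with INSERT OR IGNORE. The function name load_rows and the trimmed-down set of columns are assumptions made for this example; the real CSV has many more fields.

```python
import csv
import sqlite3


def load_rows(conn, csv_text, account):
    """Insert CSV rows into `transactions`, skipping ones already stored.

    Returns the number of rows actually inserted.
    """
    conn.execute(
        'CREATE TABLE IF NOT EXISTS transactions ('
        ' account TEXT, date TEXT, receipt TEXT, item TEXT, amount REAL,'
        ' UNIQUE (date, receipt, item))')
    before = conn.total_changes
    for row in csv.DictReader(csv_text.splitlines()):
        # Rows that collide with the UNIQUE constraint are silently skipped.
        conn.execute(
            'INSERT OR IGNORE INTO transactions VALUES (?, ?, ?, ?, ?)',
            (account, row['date'], row['receipt'], row['item'],
             float(row['amount']) if row['amount'] else None))
    conn.commit()
    return conn.total_changes - before
```

    Running the loader twice on the same download should then insert nothing the second time, which is the behaviour the filter_by check in the script is after.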