Algorithm and Code

What came first, the algorithm or the code?

To the modern day developers, there may or may not be a clear distinction between algorithm development and coding (as it is sometimes lovingly coined).

For me personally, the algorithm always comes first. It is the attempt to define in non-ambiguous terms what exactly needs to be done and in what order for achieving the desired result. Once I am satisfied with the algorithm, I then start to code it. There have been instances where I have had to revisit the algorithm and change it when problems have surfaced while coding or testing.

Afterall, I am only one person, and because of that, at times I can suffer ther classic mistake of miscalculation, bad assumption or plain oversight.

Once a person has designed a concrete algorithm, it can be implemented in almost any language. For me the two are to a great extent mutually exclusive. I know there might be a great number of softwere developers/engineers out there who will object. But I would like to clarify one thing.

Algorithm is independant of data types and language references. Why? Simply because, all problems have three aspects:

  1. Some logical action
  2. Some Decision making
  3. Some Mathematical action

These three actions are independant of the language of implementation. The choice of language will have it’s own representation of how to perform these three activities.

If one develops an algorithm for fast searching of numbers, e.g. The infamous and incredibally popular Binary Search, and descibes it in English. It would look like.

  1. Define Problem Statement:
    Find matching number in a list of ordered numbers.
  2. Assumption:
    List of ordered numbers is available.
  3. Algorithm:
    1. Pick the middle number from the list.
    2. Compare with number to find.
    3. If number to find is smaller than the middle number, then number may exist in list with numbers smaller than middle number (Step 1).
      1. Use smaller number list to search number in.
      2. Go to Step 1.
    4. If number to find is larger than the middle number, then number may exist in list with numbers larger than middle number (Step 1).
      1. Use larger number list to search number in.
      2. Go to Step 1.
    5. If number matches the middle number, then no need to search more.

Now we are ready to implement this algorithm. There are a number of things we need to conisder first.

  1. How big is our list likely to get? 
    How large will our list be, 100, 1000, 100000 
  2. What type of numbers will we be comparing? 
    32 Bit Integer, 64 Bit Integer, Fractional Values 
  3. What limitations do we have from hardware?
    Memory Limitations, Address Space, etc. 
  4. What limitations do we have from the compiler/interpreter?
    Code Optimization, Interpreter memory space (e.g. browser)
  5. Any other limitations?

I do understand that some limitations or restrictions may not be valid. In which case, you have one less thing to cater for.

Then we check conditions when this system may fail.

  1. What happens when list has zero numbers.
  2. What happens when list size approaches the maximum number we use to address list?
  3. What happens when the list population does not have number that we are looking for?
  4. What happens when the number we are looking lies between two numbers that are in contiguous spaces in the list.
    e.g. we looking for 5, but list is 1,2,3,4,6,7,8,9
  5. Any other way the implementation may fail.

Now we are ready to implement the algorithm.

Algorithms are not limited to technology, they are independant of it. They are related to solving problems. If a problem has been well throught through, the solution algorithm will not have any failings. A Verbatim translation of an algorithm will be guaranteed to fail at some point or the other, because it will not have catered to run time scenarios based on limitations of the implementation environment.

The algorithm is bad if it fails to solve the problem. The code is not to be blamed for algorithm failure. However, if algorithm has been checked and double checked, and the solution fails, then it’s a problem with the code not the algorithm.

Advertisements

MD5 Collision Creates Rouge Certificate Authority

I read this last night on CrunchGear.

http://www.crunchgear.com/2008/12/30/md5-collision-creates-rogue-certificate-authority/

It was bound to happen. The beautiful thing about this is how they achieved it. Using PS3 Cluster. 200 Nodes at ~$400[1] means the people spend around ~$80,000.00 on the nodes. Plus change for networking, rack etc. would bring it up to ~$100,000 and some change.

Well, considering, the cumulative mips, or number crunching power, I think it is far more cheaper and affordable than buying a super computer or a real enterprise class cluster.

[1] Based on David HM Spector’s comment on the article above.

My own analysis says.

If you want to keep something secure, make sure no one except no one has access to it! 🙂

When will someone crack the SHA1, SHA2 and SHA3? Sooner or later, we will have enough processing potential that the way we encrypt/protect information today will become obsolete.

Artificial Intelligence and Self Awareness

Singularity Institute for Artificial Intelligence, Inc. (http://www.singinst.org/)

EXCERPT: ABOUT US/Our Mission (http://singinst.org/aboutus/ourmission)

In the coming decades, humanity will likely create a powerful artificial intelligence. The Singularity Institute for Artificial Intelligence (SIAI) exists to confront this urgent challenge, both the opportunity and the risk.
SIAI is a not-for-profit research institute in Palo Alto, California, with three major goals: furthering the nascent science of safe, beneficial advanced AI through research and development, research fellowships, research grants, and science education; furthering the understanding of its implications to society through the AI Impact Initiative and annual Singularity Summit; and furthering education among students to foster scientific research.

To learn more about the Singularity Institute’s charitable purpose, begin with What is the Singularity? and Why Work Toward the Singularity?, followed by Artificial Intelligence as a Positive and Negative Factor in Global Risk.

New PDF: The Singularity and Singularity Institute for Artificial Intelligence

DAK – 2007-10-03

Terminator Zero – The Beginning?

SAK – 2007-10-03

Just my opinion… I don’t think that anyone can plan it. The internet, or by that, any system cannot be designed to become self aware, it will happen as an unforeseen development in the system  after it’s been deployed.

SAK – 2007-10-03

Watch Out for Online Ads That Watch You By Dan Tynan, PC World
New Web ads respond to your activities–and this has privacy advocates worried.
http://tech.msn.com/security/article.aspx?cp-documentid=3182782

SAK – 2007-10-03

I’d expect it to be some sort of bug or virus which was made for the intent of gathering data gaining entry. this virus will merge with another virus or will read the code of another virus and splice it with its own… this will create the first instance of awareness and therefore lead to the essence of AI.

SAK – 2007-10-03
(Discussion was moved to gmail – simple advantage show messages as conversations)

I am moving the thread into this thread so it’s more relevant.

That is a possibility. However, whenever “man” has tried to force something it’s not really worked out as planned. I still think, it could emerge also from some form of swarm intelligence, maybe not a virus, could also be a web crawler/spider. Like, you design a spider, that just goes out and before reading some data, checks with other spiders in it’s vicinity it they have come across this data or similar data. So in a sense there would be a payload with each spider, and say 10,000 such spiders are deployed.

A Thought Experiment:

W3C – World Wide Web Consortium recommends new web spider/crawler specifications based on the Semantic Web (Web 2.0?). Through this recommendation W3C group recommends the following features:

  1. Individual Identification:
    All spiders may have simple identification attributes that identify it based on it’s ancestors. This will enable tracking the spider evolution path.
  2. Open Communication Standard:
    All spiders shall communicate over the new standardized inter-spider communication protocol. There shall not be any vendor specific additions to the protocol defined. However, the requests and responses set may be defined to enable one spider discovering another spider’s interfaces and the data structure for the interfaces’ parameters and return values.
  3. Life Span:
    All spiders will have a life span, this life span will indicate how long a particular instance of a spider will remain active over the internet. The life span may be effected by various environmental and relational parameters, these parameters will be adaptive also based on the spiders interactions with other spiders.
  4. Reproduction:
    A spider may decide to “get together” with another spider and reproduce, based on some evolutionary logic. The child of such a mating will be a creation of a new spider which will be given aspects of both it’s parents. Spiders will by design prefer to “cross”/”breed” with spiders who do not have any ancestor in common.
  5. Adaptive Decision Making:
    All decision making will be based on adaptive logic (adaptive fuzzy logic). The spider will have a set of decision making processes covering various aspects of the spiders’ operations. A certain subset of this set will adapt through interaction with other spiders, another subset will adapt based on the information it comes across, and lastly, another subset will adapt through reproduction. There may be overlap/intersections of the subsets.
  6. Information Indexing Standard:
    All information indexed by the spider will be done on the standard specified, for ease of information integration and sharing between spiders.
  7. Information Categorization Standard:
    All information categorization will be done on the standard specified, for ease of category integration and sharing between spiders.

Feel free to add any more specifications and/or additions/modifications the the standards recommended above.

I think it would be interesting to see something like this in action. Microsoft can build spiders on this standard, but if it adds proprietary features, the other spiders won’t reproduce with it. In a sense, this would alienate and marginalize the Microsoft spiders. If information wells (web sites) do not follow Semantic Web/Web 2.0 recommendations, or follow partial recommendations, or add proprietary features, then these information wells will become “less likely” to be indexed/categorized and even then, will be ranked almost near the bottom of any ranking.

Spiders will come together, build relationships, adapt and evolve. In a sense would perhaps display a sort of swarm intelligence. Which might eventually grow, given enough numbers, into a self-aware swarm, where the swarm will be able to take polls and have some form of democratic decision making ability.

The question that come to my mind are:

  • How big a swarm would we need to make it “self-aware”?
  • What kind of “intelligence” would this swarm have?
  • What kind of decision making capability would we be able to witness?
  • Would it (swarm or individual) be able to develop new – not designed – not implemented abilities?
  • Would it become aware of information beyond the information stored on the internet (web sites, web pages etc)?

DAK – 2007-10-03

since we are talking about an application…

all such features would have to be built into the application. for example the reproduction: spiders will reproduce with others based on some logic that is pre-programmed. even if it is a random number generator in use, still at some random time frame or after every n-th spider that it meets, it will want to reproduce. technically, even if it all seems random to us and since there are any number of spiders at a time, the logical progression of the spiders can still be calculated by feeding in the beginning status of all spiders and logical history of there progression through time.

What I mean is that unless programmed to do something, the application “spider” can not go out on its own and start doing something completely new. The spider can not choose anything in its entirety. It can not say… hmm… I’m no longer gonna be a spider, I’m gonna change myself into a static website for sports and will stop performing all those things that the programmers told me to do.

Even if it were mistake in programming which caused a spider to transform… its still limited to what it’s given as the instructions to follow. The randomness as described by the below characteristics may seen as a form of AI but it more of a controlled experiment.

If someone is trying to create AI and it happens to work by some fluke and he can never reproduce that same result… that might be something which sparks it off but you still have many issues and dependencies like the AI needing some place store itself and it’s “memory”. It can obviously somehow connect to some computation machine to perform the processing which is required, like a virus does on its host machine. But unlike a virus the AI would have the need to maintain a central repository of information, even if it is dispersed or distributed, there would be something like a master list to tell where everything else is. If it is unable to locate part of its “memory” then it would be unable to work properly.

I’m sure there are ways around these limitations, just haven’t thought too much about it yet.

DAK – 2007-10-03

Wow, that’s long..

SAK – 2007-10-03

Valid points there…

The spider won’t be able to do anything else besides the core programming.

As far as information storage is concerned, well there could be a cluster of individual sites to store information, so even if the spider can’t find something in its “memory” (the information storage site it was initially programmed with), it can find another site to store information on. It could also, instead of directly looking for the information storage site, pass the information to some other spider who will do the storage and retrieval. the memory limitation question could be solved by storing information in permanent memory somewhere on the information storage cluster, or on a tiny memory carried by the spider.

Evolution. Well the spiders could evolve, and perhaps a certain group of spiders could take on the role of information storage and retrieval, others could actually act as hunters and gatherers, while others still would take the time to organize, label and categorize the information, and others could act as navigational beacons. These “functions” would initially be programmed into all spiders, because it’s in the “specifications”, however, over time, certain decision making logic adaption, certain spiders might become “better” than the others at performing some specific task.

Another question. The spiders are programs or processes running on a machine, how would it actually travel over the network? Perhaps by making a duplicate of itself on another machine and deleting the original?

Reproduction. Well, it doesn’t have to be a random number generator, perhaps a time frame, for example, after collecting 100MB data, automatically the reproduction feature turns on. After which each spider it meets, it finds out if it’s also “come of age” and then shares some information based on 157 adaptive logic parameters for example, and if the result is suitable, request to mate, it is quite possible that the potential partner may refuse, because when it checks with it’s own 157 parameters, the result is not suitable. It might be that one has developed a better parameter set for searching, while the other has developed its own parameter set to make it better for categorizing information.

The key would be to look at the “swarm intelligence” as opposed to “individual intelligence”.

DAK – 2007-10-04

Travel for the spiders would probably be most effective as moving itself to another machine, while leaving breadcrumbs or beacons at every machine they have visited. If they were to copy themselves then it would be more of a virus. A copy would also create duplicates or clones. Those clones would have the same attributes as the spider did at that given point in time. They could have separate lives or experiences from that point on but still be similar to the originating spider. This could cause repetitive work or help increase the swamp as well.

Btw, I was actually saying the same thing, but in a different way, I was thinking and describing it in what actually happens when a file or program is moved.

This poses another question though. Thinking of this spider as a running process (program) on a machine, will all the bells and whistles, like memory allocation, priority, permission levels and restrictions, file handling etc.

I have two servers A and B, I placed the spider on A and want to see how it would get around all the security measures and restrictions on B, because B has been secured/hardened by my friend D. D has also placed a very powerful anti virus on the machine, along with a near impossible to break firewall. How will my innocent little spider not only move to B, but also become a running process on B.

Currently spiders don’t actually physically move between server and travel the internet while “crawling”, rather, they run on one single machine and request for a home page, for example http://www.my-web.net, it parses all the HTML content in the file and identifies the links by looking for the anchor element <a></a>. Next it looks at the HTML resource attribute in the anchor and isolates the URL, it stores this information along with the text of the anchor in some data structure. Once it’s collected all the anchors from a page, it requests for each of these URLs and repeats the exercise for each page.

Here the spider is communicating with other servers using known and designed already public protocols and ports. This is actually quite different than trying to move a process to another machine, and getting it to run on the other machine.

This opens another Pandora’s box as far as security is concerned.

As for the swarm intelligence, I’d think that some of these spiders would have to be leaders of the swarm, maybe like a queen bee but on a multiplicative capacity. 1 queen spider per every certain number of worker spiders. The queens would communicate with other queens to create redundancy, fail over and consistency of objectives, goals, destinations. The queens would perform periodic replications with other queens to make sure they all know what is going on around the world with the rest of the swarm. Also queens could command worker spiders to help with some cause that another “team” or “troop” is working on. The greater good most be achieved, so if the worker spider with advanced snooping capability is needed on network Y then he is either cloned or must reproduce with another worker spider who has a mature function as well.

Also, reproduced spiders must be have all attributes of both contributing spiders. So that the experiences and advanced capabilities are now tied up into one single spider which should in turn be more sophisticated and better informed of what to perform.

This would of course cause for a multitude of spiders, each performing something and holding unique experiences and/or knowledge. Those which are reproduced being of a higher caliber, therefore every generation being of more “use” or “power” then previous generations. What would become of older generations? I think they should forced to combine at some point, making sure some new generation spider is able to perform all the work they were doing. Maybe when fewer spiders are able to perform their work more efficiently then the older generations are merged into a single spider with all attributes of those combined and converted into a queen spider, which shares itself with all other queens, hence increasing the over all knowledge base.

Here is what I had proposed in the form of an actual life cycle model.

  1. Spider is Born
  2. Spider spends time growing up (gathering experiences)
  3. Spider mates
  4. The mated spider pair creates offspring
  5. After n offspring the mated spider pair stops running and is deleted

Of course this is an extremely simplified version, though, this will act as a kind of population control for the spiders.

Also, to keep the queens ahead of the ball game, all newer or advanced features would propugate to the queens. So that they have a search function which is the combination of the search function of all other search functions. With this accumulation they would perform analysis to decide the best set of functions to retain and which to discard. That is then passed to all queens to maintain consistency.

I think we’ve created a new world now… what should it be called?

SAK – 2007-10-04

Lets leave the name for now.

Now we’re talking about hive mentality, you know, the Queen and it’s minions.

I think these are two different similar ways of behaving, the swarm and hive mentality.

DAK – 2008-10-04

well even with birds, there is a bird which leads the flock.

i believe that’s the reasoning they stated behind the formations made by bids.

also if the lead bird hits a glass, everyone else does as well. they will follow the leader of the flock.
the type of bird that flies alone, on solo missions, would be an eagle or falcon.

back to the spiders… if there are no queens then you would have a collection of spiders doing what they want and yet trying to achieve the greater good for all other spiders. a no management scenario would probably not work for the best for an AI system.

SAK – 2007-10-04

You are right in that part, there has to be a leader. However, my approach was different, that there will be a “collective intelligence” as opposed to “hierarchical intelligence”.

Ants have a queen, they have worker ants, soldier ants and nurse ants for ant ant hive. each has a specific aim, which is to serve the queen ant. However, each ant type has it’s role “genetically” defined, this is handled by the queen laying the different types of eggs. The worker ants for example, only know that their role is to go out and hunt for new food resources and bring them back. They do not know any thing else. However, take away the queen, and one of the ants will become the queen, so the “code” is there that says something like this:

if (Queen_Is_Dead) {
    Arbitrate myArbitration = new Arbitrate();
    myArbitration.arbitrateWithColonyMembers();

    if (myArbitration.Results.NewQueen == Self.Identity) {
        Evolve_Into_Queen = true;
    }
}

But this is digressing….

There are different schools of thoughts. Make singular intelligence, or, make multiple less intelligence entities that can talk and share with each other.

Looking at out brain, we have some billions of neurons, they don’t fire away blindly without a central command, they actually work together, in groups that are specialized to do something or the other. for example, the visual cortex is designed (has evolved) to work with eyes and it’s function is very specific. The auditory section deals will only converting sound waves into meaningful (to the brain) information. these two work with the speech center, when we are reading. The speech center only deals with making us make sounds from our mouths.

Now the question is. What makes us self aware? what separates my brain from my mind. What is my mind? which part of the brain is my mind? Is consciousness, self awareness a manifestation of my mind?

There are a lot of questions around.

One more example, take grid computing, for example, SETI@Home, if you have small applications with very small foot print so it’s unobtrusive to the machine user. And each one implements say 1000 neuron based network, and you have say, 1 million such applications running. You end up with 1000 million neurons, these neurons come together based on the information on the individual computers’ user data on that machine and making connections with other neuron sets based on such information. What kind of self awareness would this network have?

DAK – 2007-10-04

well some form of pattern will have to be used, be it ants, bees, birds, the brain, etc… some sort of already existing system will either be mimicked or else make up something entirely new.

Computation Driven Art

Sometimes while browsing the internet, I have come across sites that make me sit up and pay attention. These sites are portals, not in the true sense of web-portals, rather they are portals to a different way of looking at things, or doing things.

I am talking about http://www.levitated.net.

It is a web site dedicated to computation driven digital art. Not necessarily restricted to static forms of computer generated images.

It opened my mind to looking at everyday things in a totally different way. To simplify complex systems into many simple smaller systems interacting together, their interaction governed by simple basic rules.

Almost intelligent.

There are the three related sites.

  1. Levitated | the Exploration of Computation – http://levitated.net
  2. Gallery of Computation – http://complexification.net/
  3. Processing web site – http://processing.org

I’ve not linked the Processing site, as it was giving me a timeout. Perhaps a network hiccup, or load on their server.

The basic crux of such portals to alternate thought streams is the empowering of people to better visualize information leading to better interpretation and decision making. However, the main purpose of these sites are to let people have fun while getting important work done.

We humans are creatures mainly driven by visual stimulii. We evolved that way, we were not born with language and the ability to express complex and non-tangible thoughts. We sill need that stimulation of our visual cortex. It makes understanding so much easier. We can absorb volumes of information in a very short time when we leverage our ability to grasp things, ideas, concepts through visualization.