
Re: (null)

(This is a reply to an email sent to me privately, but I'm copying
the response to the list because I feel it is interesting for infobot
development, especially since it refers to the devel series.)


Rocco Caputo writes:
> If geckobot's using POE, the Filter::Reference will let you pass
> around Perl structures.  It freezes them on the sender's side and
> thaws them on the receiver's.  Send syntax is basically:
>  $wheel->put(\@thing)
> On the receiving end, you get an event that says "you got this
> thing, \@thing".

Eee. Um. That's nice, but... isn't it a little trusting? I mean,
is there some negotiation first, or do you indiscriminately receive
anything sent to you? Presumably you'll use the received thing as a
Perl object, and maybe execute methods on it, and the idea of
running arbitrary untrusted subs is rather scary for me. Sure, it's
fine if we're just passing around factoids and meta-data, but once
you get code coming across, I wouldn't touch it with a big stick.
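
For the record, here's roughly what I understand Rocco's description to
look like in practice -- an untested sketch, with a local socketpair
standing in for a real bot-to-bot connection and the event names made up:

    use POE qw(Wheel::ReadWrite Filter::Reference);
    use Socket;

    # A socketpair stands in for a connection between two bots.
    my ($end_a, $end_b);
    socketpair($end_a, $end_b, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";

    POE::Session->create(
        inline_states => {
            _start => sub {
                my $heap = $_[HEAP];
                # One wheel per end; Filter::Reference freezes (Storable)
                # on put() and thaws whatever arrives on the other side.
                $heap->{sender} = POE::Wheel::ReadWrite->new(
                    Handle     => $end_a,
                    Filter     => POE::Filter::Reference->new(),
                    InputEvent => 'ignore',
                );
                $heap->{receiver} = POE::Wheel::ReadWrite->new(
                    Handle     => $end_b,
                    Filter     => POE::Filter::Reference->new(),
                    InputEvent => 'got_thing',
                );
                $heap->{sender}->put([ 'hoge', 'is', 'Japanese', 'for', 'foo' ]);
            },
            got_thing => sub {
                my ($heap, $ref) = @_[HEAP, ARG0];
                # $ref is the thawed structure -- note that we get whatever
                # the far end chose to send, which is exactly my worry above.
                print "got: @$ref\n";
                delete $heap->{sender};    # drop the wheels so run() returns
                delete $heap->{receiver};
            },
            ignore => sub { },
        },
    );
    POE::Kernel->run();

(As far as I know, Storable won't freeze or thaw code refs by default
anyway, which is some comfort -- but I still wouldn't want to call
methods on anything that turns up.)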

> Artur/Sky has a patch pending that lets it use Compress::Zlib
> transparently, so network traffic can be balanced against CPU.

Oh, wow. That's special.
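
If that patch turns into an extra argument on the filter constructor --
and this is pure guesswork about the eventual interface on my part --
I'd expect it to be no more work than:

    # Guessed interface: first arg picks the serializer (undef = the
    # default, Storable), second switches on Compress::Zlib.
    my $filter = POE::Filter::Reference->new(undef, 1);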

> On the decentralized side, factoids could be propagated like
> Usenet news messages.  Factoids won't get lost if there are
> redundant paths.

OK, let's take this principle and run with it. Here's my complete
train of thought on how interbot is going to work if we go with a
peer-to-peer network. How you traverse it is up to you, but I've tried
to make it a DFA. :)

i) Infobots have a set of peers. If we're really lucky, these peers
form a k-connected graph. -> ii, iii, v

ii) How do you discover who your peers are? Either we hard-code them in
config files, or we discover them automatically. If we're going to discover,
which I'd prefer because `discovery' is my programming buzzword of
the moment, we'll need some sort of central registration scheme
for infobots. You choose your peers by who's networkologically
close to you. If you're going to go to a central registry, though,
you might want to ditch the idea and go with something either
massively-connected (all bots connect to all other bots) or maybe
a star-based topology where everyone connects to a MetaBot which
forwards and proxies (and caches, yes!) queries. You also lose
the distributed nature of the network to a single point of failure.
Or should there be multiple proxy bots? How many levels can you go
that way? -> xii
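
For the hard-coded case, by the way, I'm not imagining anything fancier
than a wee Perl config file -- the file name, peer names, hosts and
ports below are all made up:

    # peers.conf -- pulled in with do() or require; purely hypothetical
    %peers = (
        geckobot => { host => 'bot1.example.org', port => 9000 },
        infobot2 => { host => 'bot2.example.org', port => 9000 },
        metabot  => { host => 'hub.example.org',  port => 9000 },
    );
    1;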

iii) Are peerings bi-directional? Usenet feeds are (generally)
two-way; I can receive news from my peers, and I post news to
them as well. I'm named in their config files, and they're named
in mine. -> iv

iv) Does an information request from a bot need to be any different
from an information request from a human? -> viii

v) A request for information goes out to all peers, each of which
replies if they know the answer. -> vi, vii

vi) We may receive two conflicting answers. Do we simply take the
first one? Or can we invent a way of validating which is more
appropriate? If so, do we keep both factoids anyway? In short,
how are different factoids going to co-exist in a distributed
knowledge base? -> xii

vii) How do we propagate the search further? We need a sense of
a return Path and a way to terminate the search if we've found
the information. Usenet provides a `cancel' control message to
do this, effectively. But where do we store state data about a
query? -> viii, x

viii) Ideally we'd have a response look the same at every level,
whether initiated by a human, a bot or another bot further
down the chain, just to make implementation nice and easy. Why
should the bot have to care whether it's talking to me or to
another bot? So, we'd have a message ID stored *locally* keying
what went to whom and who it's for. -> ix

ix) However, we can't send the message ID out if we want it to
be transparent, since I'm a human and I wouldn't send out a
message ID with my queries. So, we have a situation like this:

     Human: BotA, tell me about hoge

BotA - need to tell Human about hoge, add to to-do list.
don't know about hoge. ask peers, [and this is the key!]
go back to the event loop.

     BotA: Human, sorry, don't know. I'll see if I can find out.

     BotA: BotB, tell me about hoge
     BotA: BotC, tell me about hoge
     BotA: BotD, tell me about hoge

BotB - need to tell BotA about hoge. don't know about hoge.
ask peers.
....

BotF - need to tell BotC about hoge.

     BotF: BotC, hoge is the Japanese for foo.

BotC - received new factoid `hoge'. Checking to-do list.
Have to tell BotA about `hoge'. Clear `hoge'->BotA from
to-do list.

     BotC: BotA, hoge is the Japanese for foo.

BotA - received new factoid `hoge'. Checking to-do list.
Have to tell Human about `hoge'. Clear `hoge'->Human from
to-do list.

     BotA: Human, hoge is the Japanese for foo.

[Hence, there was no need to keep state data. However... -> x]
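
In code, the bookkeeping for that exchange is tiny. A rough sketch in
plain Perl (no event loop, and every name here is invented -- send_to()
just stands in for however the bot actually talks to a peer or a human):

    # %factoids is the local knowledge base; %todo maps a topic to the
    # list of askers (humans or bots) still waiting to hear about it.
    my (%factoids, %todo);
    my @peers = qw(BotB BotC BotD);                  # invented peer names

    sub send_to { my ($who, $msg) = @_; print "$who <- $msg\n" }   # stub

    # Called for "tell me about $topic", whoever $asker happens to be.
    sub handle_query {
        my ($asker, $topic) = @_;
        if (exists $factoids{$topic}) {
            send_to($asker, "$topic is $factoids{$topic}");
            return;
        }
        push @{ $todo{$topic} }, $asker;
        send_to($asker, "sorry, don't know. I'll see if I can find out.");
        send_to($_, "tell me about $topic") for @peers;
        # ...and that's it: straight back to the event loop, no blocking.
    }

    # Called when a peer comes back with "$topic is $answer".
    sub handle_factoid {
        my ($topic, $answer) = @_;
        $factoids{$topic} = $answer;
        my $waiting = delete $todo{$topic} or return;
        send_to($_, "$topic is $answer") for @$waiting;
    }

    # e.g. handle_query('Human', 'hoge');  then, later,
    #      handle_factoid('hoge', 'the Japanese for foo');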

x) So, we can reliably pass information between bots and humans
without keeping any state *in the message* or distinguishing between
human queries, bot queries and second- (or n'th-) level bot queries.
BUT... if we have a k-connected graph, which is nice for redundancy,
yet we're not keeping track of who's asked who what, how on earth
do we exhaust the search? -> xi

xi) I guess one solution may be, when receiving the `will try
to find out' message, check whether we have this from all peers,
and if so report `none of my peers know about this'. This will
prune that particular branch of the search. But is that now
guaranteed to find-or-fail in all circumstances? (I can't get my
head around the topology of this.) And how do we code it into an
event loop? -> xii

xii) Don't know.
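
Going back to (xi), though: the bookkeeping I can picture -- reusing
%todo and send_to() from the sketch under (ix), treating a peer's own
`none of my peers know' as that peer's final word, and with no guarantee
from me that this terminates on every topology -- is a per-topic set of
peers we're still waiting on:

    # $outstanding{$topic} = { peer_name => 1, ... } -- peers we asked
    # who haven't yet sent either the factoid or a definite give-up.
    # (handle_query above would set $outstanding{$topic}{$_} = 1 for
    # each peer it forwards the question to.)
    my %outstanding;

    # Called when a peer gives up on $topic for good.
    sub note_no_luck {
        my ($peer, $topic) = @_;
        delete $outstanding{$topic}{$peer};
        return if %{ $outstanding{$topic} };    # still waiting on someone
        delete $outstanding{$topic};
        # Every peer has given up: prune this branch of the search by
        # passing the bad news back to whoever asked us.
        my $waiting = delete $todo{$topic} or return;
        send_to($_, "none of my peers know about $topic") for @$waiting;
    }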

> I have too many ideas and not enough implementation.

You and me both.

Simon