This is the second part of the notes collection. Martin Mroz formatted the first part with Word, up until January 26, 2002.
January 29, 2002
I found that I still have a problem with OBJECTs and spkrsays or addtoqs. Particularly, things that shouldn’t be saved by addtocontext are being saved. Here is the sequence:
the room is cheap.
how much?
is the room expensive?
(maybe). "Speaker asks how expensive is the inexpensive room" is creating a copy of room with an orthogonality violation. I think that I need to tag the OBJECT so that I can identify it later. Hence, use of an auxiliary tag, AUXTAG4.
January 30, 2002
I am looking at an issue with simple-statement-17, wherein the verb "tobe" isn’t promoted. When used in an inference template, it means that inference must match the TNP of the statement to evaluate to TRUE. Take, for example:
if i ask how much is a room then tell me what a room costs.
I want this to apply to "how much is a room" as well as "how much are rooms." The verb tense necessarily should match. The person matches. The number is an issue though: singular in the template and plural in the question. How do I make concessions for the number without kludging?
February 15, 2002
Been very busy inside and outside of work. Here’s a summary of a show last week in Scottsdale:
I went to the last evening and final day of the "Telephony Voice User Interface Conference" in Scottsdale, earlier this week. We had a table for the demo and discussion day. Briefly, it went very well. The attendees were very knowledgeable. Some folks sought us out, having heard about Brainhat already. There were a few serendipitous discussions too. I got a lot of "Wow! That’s cool!" reactions. One fellow had just come out of a talk on NLP where the stuff I was demonstrating was painted as "the future." :-) I spent a good deal of time with the product manager for Rhetorical, the company that sells the "Alisha" voice Rich talked to in the demo. Frankly, the discussion was about exploiting the entertainment possibilities of Brainhat + a soft voice. I’ve been trading mail with the woman since I got back. I also had 15 or so beers with a columnist for Speech Technology Magazine. I think it won’t take too much arm-twisting to get him to write about us. I also talked to people with money, people with applications. My only complaints were that 1) we should have been on the speaking agenda and 2) the last day of a conference is a difficult time to attract people to a trade show floor.
Bug time: "i like you", broken. Fixed.
Here’s a real puzzler... in the "why did it ever work" category. I say:
you have a room. the kitchen is pretty. do you have a room?
The trouble is that qacandsall trolls the context looking for "rooms" to fill the OBJECT slot in the question. The candidate, "kitchen", fits the bill and wins the contest against the other candidate, "room." The trouble is that Brainhat does not have a "kitchen."
qacandsall appears to be working better than it used to.
debug> dump
define Root [19638]
    enable      do-1
    object      kitchen-1-4a56
    verb        have-2
    subject     brainhat-1-49f7
define Root [19633]
    enable      do-1
    object      room-1-49f9
    verb        have-2
    subject     brainhat-1-49f7
define Root [19628]
    enable      do-1
    object      kitchen-1-4a56
    verb        have-1
    subject     brainhat-1-49f7
define Root [19623]
    enable      do-1
    object      room-1-49f9
    verb        have-1
    subject     brainhat-1-49f7
The clip above shows the question "do you have a room" after the rule has matched, but before any mprocs have fooled with it. The two basic forms "kitchen" and "room" both deserve a chance to evaluate to "yes." The trouble is that chooseone picks the wrong one, and it continues on to declques2 to return an answer of "maybe."
I’d like to assume that I could get away with choosing the direct match to the word(s) passed into qacands. For instance, if I found "room" when I was looking for "room," I might just go with it. The danger is that "room" might not be the best candidate. Consider the following sequence:
You have a pretty room.
The kitchen is near the water.
Is the room near the water?
February 16, 2002
Perhaps the problem is that I am choosing a candidate question before I know what the answers might be. As with a person, Brainhat could be greedy about the potential answers and try to find a definitive "yes" or "no." Anyway, I am going to play with it....
That certainly fixed the problem. I removed an invocation of chooseone that precedes a pass through declques2, and came up with a couple of "yes" answers and an "i don’t know." One of the "yes" answers was chosen. I am going to look at all of the other questions, remove the early choices, and then see if the tests fail. A thought occurs to me: the cost of running multiple questions was much higher when inferences were exercised at question-time. Now they’re run speculatively.
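The shape of the change, as a toy sketch--evaluate every candidate first and only then choose, preferring a definitive answer over "maybe." The evaluate() routine here is a made-up stand-in for declques2, not the real code:

#include <stdio.h>

enum answer { ANS_MAYBE, ANS_NO, ANS_YES };

/* hypothetical per-candidate evaluation, standing in for declques2 */
static enum answer evaluate(int candidate)
{
    /* pretend candidate 2 ("room") answers "yes" and the rest "maybe" */
    return candidate == 2 ? ANS_YES : ANS_MAYBE;
}

int main(void)
{
    int candidates[] = { 0, 1, 2, 3 };  /* e.g. kitchen/room x have-1/have-2 */
    int best = candidates[0];
    enum answer best_ans = ANS_MAYBE;

    /* evaluate all candidates first; choose afterwards */
    for (int i = 0; i < 4; i++) {
        enum answer a = evaluate(candidates[i]);
        if (a > best_ans) {             /* definitive beats "maybe" */
            best_ans = a;
            best = candidates[i];
        }
    }
    printf("winner: candidate %d, answer %d\n", best, best_ans);
    return 0;
}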
February 17, 2002
Doing stuff! Looking at sent-desaction-1a to see if I really need it.
Two "desire" rules eliminated. I also added a "[to ]" into sent-action-10 to soak up the infinitive.
Five minutes later.... hmmmm, I put some rules back. There are some "desire" rules that depend on csubobj-stmt constructs. I have to check whether I can accomplish everything I need to do using simple-statements.
Later.... Interesting case: I fixed declques2 the other day. And I moved chooseone to come after declques2 in most cases. The problem is that addtoqs needs to store the question before declques2 gets it, but this is also before chooseone gets a chance to reduce the number of questions to a single one.... I left off in question-action-2, thinking about this.
I still haven’t tackled the remainder of the "desires" patterns. I was working to make csubobj-statements input simple-statements (and it seemed to be working) when I ran into this little brainteaser.
February 18, 2002
The fix came to me lying in bed last night. Wonder what it was....
Oh yes.... I think I might want to push a copy of the CC containing the questions before I run declques2. If the answer comes back "empty" (which means "I don’t know"), then I can restore the chain, run it through chooseone and pass it up for storage by addtoqs. But what do I do if declques2 finds the answer? Need to do some experimentation....
The near-term answer is that the question is preserved if the answer is yes or no. Actually, it is worse than that. If I say "mario does not see the ball" and then ask "does mario see the ball?", the question gets recorded as "speaker asks does mario not see the ball?"
Perhaps I should have addtoqs run before attrques2 or declques2. It will further need to pick just the first element of the conchain.
Here’s the soul-baring confession: the question that I save--the first one on the conchain--might not be the one that gets answered; it is possible that declques2 will like the 3rd or 4th question on the chain better (by answering "yes" or "no"), and so chooseone will make it the winner. The reason I think that I will get away with it is that the questions are merely being saved so that I can later posit things like: "do i ask if mario sees the ball?" The wrong interpretation of the question may be stored, but the answer will still come back "yes" because the "do i ask..." version of the question will also (probably) have a permutation that includes the wrong interpretation.
The only downside to this tomfoolery is (I hope) the possibility that the discourse buffer could get polluted with off-subject words. Anyway, I am going to try this and see what happens.
Rich and I have been fooling with a tree-view java widget. We’re doing some great stuff. But now, back to the question of whether I can make csubobj-stmts into simple-statements....
What’s the difference if I use csubobj-ana versus csubobj-prep in a statement?
February 21, 2002
We had a meeting yesterday where we discussed the nature of the products we are cobbling together in terms of their definition, marketing and revenue potential. The group consisted of me, Rich, John, Rick, Jack and Mike. Some fundamental changes in product definition were the result. It’s really terrific stuff, in my opinion.
The idea is that we recognize that we sell three product lines, plus support and professional services. The products are the core platform, scenarios and interfaces. The purchaser could mix and match these components to suit a need.
One of the most important observations is that we are recognizing scenarios as part of our intellectual property. We are also abandoning the notion that scenario building is for everyone.
John was adamant that we also create a user community for development of applications (scenarios) by professional users of Brainhat. This is an idea that Mark offered some time ago as well.
Some big changes coming for Brainhat:
I need to make the input parsing more greedy. Currently, the program stops when a pattern matches. I need to have it continue, looking for more matches. In coordination, I need to have the input parsing routines be mindful of whether they have consumed all of the input, or whether some is leftover following the match. This will give me a sense of the quality of the match and possibly direct whether I bother to attempt sloppy matches. Routine bestfit already depends on knowing whether input is completely consumed. I’ll have to mesh with the existing code.
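A toy of the greedy matching idea--nothing like the real rule matcher, but it shows the bookkeeping I have in mind: try every pattern, record how much input each one consumes, and prefer complete consumption. match_pattern() is a hypothetical stand-in:

#include <stdio.h>
#include <string.h>

/* returns the number of input characters the pattern consumes,
   or 0 if the pattern does not match at all (hypothetical) */
static size_t match_pattern(const char *input, const char *pattern)
{
    size_t n = strlen(pattern);
    return strncmp(input, pattern, n) == 0 ? n : 0;
}

int main(void)
{
    const char *input = "mario sees the ball";
    const char *patterns[] = { "mario sees", "mario sees the ball" };
    size_t total = strlen(input);
    int best = -1;
    size_t best_used = 0;

    /* do not stop at the first match; keep looking for a better one */
    for (int i = 0; i < 2; i++) {
        size_t used = match_pattern(input, patterns[i]);
        if (used > best_used) {
            best = i;
            best_used = used;
        }
    }
    if (best >= 0)
        printf("pattern %d wins, %zu of %zu chars consumed (%s match)\n",
               best, best_used, total,
               best_used == total ? "complete" : "partial");
    return 0;
}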
I want to disconnect Brainhat’s beliefs from what it hears the user say. This is necessary for correct processing of idioms. I experimented with it just before Christmas. In addition, I need to be able to correctly handle whole statements as template components in inferences. Take, for example, "if i say something then i mean something." I want "something" to stand in for a whole CC.
Memory management needs work. 1) I need to reclaim nodes more often--perhaps whenever chooseone gets called. 2) I need to be smarter about how I re-use the nodes that are reclaimed. The memory footprint, though large, should concentrate activity in localized regions so that we can avoid cache, page and DTLB faults.
Allocation, location and reclamation of memory re-consumables (concepts, etc) should be directed by hash tables whose layout reflects the linear memory map. When a node is allocated, it should be pulled, if possible, from memory space near to it so that there will be some locality of reference. If we can’t find any space that is "near" then we could look for space in neighboring buckets.
I suspect that the reclamation pattern in use now is returning recycled nodes from the end of the memory map.
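Something like this sketch is what I have in mind, under my own assumptions about the node type: reclaimed nodes go back into a free-list bucket that mirrors their region of the linear memory map, and allocation tries the bucket nearest a "neighbor" node before spilling into adjacent buckets:

#include <stdio.h>

#define ARENA_SIZE  (1 << 20)
#define NBUCKETS    64
#define BUCKET_SPAN (ARENA_SIZE / NBUCKETS)

struct node { struct node *next; /* ... concept payload ... */ };

static char arena[ARENA_SIZE];
static struct node *buckets[NBUCKETS];  /* free nodes, by memory region */

static int bucket_of(void *p)
{
    return (int)(((char *)p - arena) / BUCKET_SPAN);
}

/* reclaimed nodes go back to the bucket that matches their address */
static void node_free(struct node *n)
{
    int b = bucket_of(n);
    n->next = buckets[b];
    buckets[b] = n;
}

/* allocate near 'hint' if possible, spreading outward to neighbors */
static struct node *node_alloc(void *hint)
{
    int b = hint ? bucket_of(hint) : 0;
    for (int d = 0; d < NBUCKETS; d++) {
        int lo = b - d, hi = b + d;
        if (lo >= 0 && buckets[lo]) {
            struct node *n = buckets[lo];
            buckets[lo] = n->next;
            return n;
        }
        if (hi < NBUCKETS && buckets[hi]) {
            struct node *n = buckets[hi];
            buckets[hi] = n->next;
            return n;
        }
    }
    return NULL;  /* nothing reclaimed yet; fall back to fresh space */
}

int main(void)
{
    /* seed the free lists with a few nodes scattered through the arena */
    for (int i = 0; i < NBUCKETS; i++)
        node_free((struct node *)(arena + (size_t)i * BUCKET_SPAN));

    struct node *n = node_alloc(arena + 7 * BUCKET_SPAN);
    printf("allocated from bucket %d\n", bucket_of(n));
    return 0;
}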
Note to self: Look into the caching in place for the symbol table. I’m not sure that code is working at all.....
Yep: the memory allocation pattern is atrocious. Fixing it will be a good project, but first!--> I need to make stateless httpd brainhat.
Stateless httpd brainhat will require that I drop server-side cookies into a temporary directory. The "cookies" will include the context, discourse and question buffers(?).
February 22, 2002
I am on my way to NYC to meet with a woman from Rhetorical, the company that makes the designer voices we are planning to use with some of the Brainhat products. The subject at hand is the "relationship server," and the telephony, ASR, NLP and TTS challenges it poses. I have some hand-scratched notes about the technical challenges, projects and organizational issues as I see them. The important thing is that any effort expended on the relationship server will serve to open up the dialog market in general.
Need a nap...
Later (on the way back). The meeting went well. It is 11:00 PM. I am lucky to have my computer at all, having left it in a bar and a restaurant tonight. The food was good. We agreed, in principle, to put together a demo and use that to help develop some cash.
February 24, 2002
I am working on being able to save state between runs. Here is the verbiage from the top of the file state.c, currently under construction:
To date, Brainhat daemons for VXML, HTML and text have been stateful. For the VXML and HTML invocations, following the first contact with the daemon, subsequent connections are directed to another port where the stateful copy of Brainhat lies waiting. Statefulness is useful if there are back-end connections to other stateful things, such as robots or processes. But it carries the burden of a replicated memory footprint for each of the waiting copies and the inconvenience of managing accesses to other ports. The port redirection thing became particularly troublesome lately when we started playing with a JAVA plugin for the web site. The plug-in will allow the user to "see" the parsed output from Brainhat. The problem is that the redirection appears to violate JAVA runtime security for the applet, and hence no phrase tree pictures appear after the first connection. Anyway, the impetus for making a stateless Brainhat is getting stronger. It will also be useful for being able to save and re-invoke state between boring old command line sessions. I will need a few routines:

1) a routine to create clever file names for storing the state between Brainhat runs. The files that will be created are going to be the equivalent of server-side "cookies."

2) a routine to save the state of the context, discourse and saved_question buffers, plus anything that might be hanging around from ’bestfit,’ the routine that tries to improve speech recognition.

3) a routine to re-incarnate the state of the last session. This will basically read stuff back into the context without running any inferences.
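Roughly what I have in mind for the three routines--a sketch only, with an invented file-name scheme and one opaque blob standing in for the real context/discourse/question serialization:

#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* 1) a clever-enough "cookie" name: directory + pid + timestamp */
static void statefile(char *buf, size_t len, const char *dir)
{
    snprintf(buf, len, "%s/bh-%ld-%ld.state",
             dir, (long)getpid(), (long)time(NULL));
}

/* 2) save the state buffers (placeholder: one opaque blob) */
static int savestate(const char *path, const char *blob, size_t n)
{
    FILE *f = fopen(path, "w");
    if (!f) return -1;
    fwrite(blob, 1, n, f);
    fclose(f);
    return 0;
}

/* 3) re-incarnate the last session: read back, no inferences run */
static long loadstate(const char *path, char *blob, size_t max)
{
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    long n = (long)fread(blob, 1, max, f);
    fclose(f);
    return n;
}

int main(void)
{
    const char *state = "context+discourse+questions";
    char path[256], back[64];

    statefile(path, sizeof path, "/tmp");
    savestate(path, state, strlen(state));
    long n = loadstate(path, back, sizeof back);
    printf("%s: %ld bytes restored\n", path, n);
    return 0;
}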
Concepts can have multiple synonyms attached. The very first is always the definition name, as given in the "define" statement. To avoid ambiguity when reinitializing the context from a file, I would like to use the "define" names instead of any of the other synonyms. Since the same output routines that allow Brainhat to "speak" are being used to create a copy of the context, I am going to need a way to tell the "speak" routines to use the first synonym. Currently, the routines always choose the last. So, I am going to create a little routine that chooses a synonym from the text links in a concept instead. It will have a flag that says "take the last", "take the first", and "take one at random, excluding the first." We’ll see what happens; a little mixing of synonyms might add some color to Brainhat’s output.
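The synonym chooser could look something like this sketch--my own names and a plain string array, not the real text links:

#include <stdio.h>
#include <stdlib.h>

enum synflag { SYN_FIRST, SYN_LAST, SYN_RANDOM };

static const char *choosesyn(const char **syns, int n, enum synflag flag)
{
    if (n <= 0) return NULL;
    switch (flag) {
    case SYN_FIRST:  return syns[0];        /* the "define" name */
    case SYN_LAST:   return syns[n - 1];    /* current behavior  */
    default:         /* random, excluding the first */
        return n == 1 ? syns[0] : syns[1 + rand() % (n - 1)];
    }
}

int main(void)
{
    const char *room[] = { "room-1", "room", "chamber" };
    printf("%s %s\n",
           choosesyn(room, 3, SYN_FIRST),    /* for saving state */
           choosesyn(room, 3, SYN_RANDOM));  /* for colorful output */
    return 0;
}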
(later) Part-way there. Now I am going to work on something else for a while...
February 26, 2002
Mark Miller noticed that the following sequence is broken:
>> the red ball is round
the ball is round.
>> the blue ball is oval
the ball is oval.
>> what shape is the red toy
the ball is blue is oval.
>> what shape is the blue toy
the ball is blue is oval.
March 1, 2002
I am working with Mark Miller on the last chapter of his book.
(later) I am grieving over the grammar. I need to be able to easily create and test grammar rules for any kind of input. To be complete, any utterance should be able to satisfy the following tests:
1) statement input, e.g.: "mario sees the ball",
2) embedded in a question about the statement: "do i say that mario sees the ball?"
3) output in a statement: "tell me that mario sees the ball."
4) as part of an inference: "if i say [that] mario sees the ball then you are happy."
Then there are questions:
1) "does mario see the ball?"
2) "do i ask [if|does] mario sees the ball?"
3) "ask me [if|does] mario see the ball"
4) "if i ask does mario see the ball then you are happy."
And there are "W" questions:
1) "what is in the room?"
2) "do I ask what is in the room?"
3) "ask me what is in the room."
4) "if is ask what is in the room then you are happy."
March 5, 2002
Back to thinking about improvements in the grammar and ways that I might be able to manufacture grammar rather than slog through it by hand, like I do now....
I ran into trouble with ’what comes with a room’--part of Mark’s book scenario. I want to use this as a prototype for manufacturing grammar for all possible uses.
Step 1) Make the baseline question work: "what is in a room?" (okay, covered by question-what-8), using a normal form of some sane construction.
define Root-491d [18717]
    label       question-what-6
    child-of    things
    auxtag4     Root-491d
    subject     things
    verb        tobe
    attribute   Root-4919
    tense       present
    number      singular
    person      third
This is taken from the context where "the speaker asks ’what comes with a room’" is recorded. It looks reasonable.
Step 2) Make "ask me what comes with a room" work. To do that, I need to have a simple statement that comes up with the same representation as above. In addition, I need to be able to express "what comes with a room" independently from all other possible interpretations. This probably means that I need to add an auxtag of some sort to the CC.
Perhaps the AUXTAGs should be more general. Perhaps there should be a single link type "auxtag" with a parameter that describes the kind of sentence we recognized. More globally, I am at the brink of thinking about voiced representations of ideas independent of their stored forms.
The sentence "ask me what comes with a room" does not work. I need to make a bit of "speak" grammar and a routine to process the output.
March 13, 2002
Rich and I have been working on the database/scenario issues. I am writing up a little description for the others in the project. I’ll paste it here....
Brainhat News -- 031302

Rich, John and I have been up to some very interesting stuff. I thought we should share it with you. The information in here is confidential.

Database

Brainhat currently reads vocabulary, grammars and scenario information from files at start-up. This is inflexible, at best. It also means that we have to drag into core everything we might possibly use during a conversation, which has storage and start-up cost implications. We have been working on a SQL database interface for Brainhat. It solves the start-up cost questions, and it will allow us to share information between Brainhats and across runs. The database can store grammars and vocabularies for scenarios and sub-scenarios, and dynamically deliver them back to the Brainhat engine. The biggest impact, though, is that the database will allow us to provide Brainhat with dynamic focus in the course of a conversation.

An illustration: Imagine you are in the Louvre, looking at the Mona Lisa. Brainhat is there, acting as your tour guide. Three scenarios are active: a core scenario, the Mona Lisa scenario and the restrooms scenario. The current focus is the Mona Lisa scenario; the grammar, vocabulary, proposition and inference collections are appropriate to discussing the painting. Out of the blue, you say "I have to poop." Brainhat has no vocabulary for this in the Mona Lisa collection, but it has appropriate inferences, vocabulary and grammar in the restroom collection. This will motivate a change in focus. Within Brainhat, this should be as straightforward as swapping clean concept symbol tables dynamically. The context of the conversation will be preserved across focus changes; you would be able to ask "does Mona Lisa have to poop?" The answer would be "maybe. you have to poop." You get the idea.

GUI

In the GUI, even in its current state, you can edit vocabulary, grammar and scenario information for different foci. Just as one crafts a scenario with inferences to move a conversation from point A to point B, the artform will be to craft scenarios and sub-scenarios that motivate focus changes as a conversation progresses. A couple months down the road, the GUI will be the preferred window into the database and Brainhat, and the database will be where all of the runtime data are stored. What we call "Brainhat" now will look more and more like a processing component of the bigger picture. Another element of the GUI is a set of services to allow one to monitor the activity of the database: who is using the database, when did they start, when did they finish, what are the active scenarios a certain user is using, what is the scenario of focus...

Internal Brainhat Changes to support Dialog Progression

Brainhat currently processes input in chunks; whole utterances are presented and parsed against the collection of input grammar rules, one at a time. We will be making changes so that multiple grammar rules are processed at once, and so that we may take input in pieces, before it is completely delivered to the program. This will be necessary to make real interactive dialog possible. There will be challenges: detecting that something is wrong with the current ’best’ interpretation, deciding where to back up to undo the earlier incorrect choice of interpretation, how to proceed from the point at which we make a different choice, and how to determine whether the new interpretation is better than the original.

Grammar and Interpretation

We are also working to make it easier to craft new grammar, which at this time can be pretty challenging.
Consider that any construct you wish to parse has to be handled in multiple ways, and also has to be able to be "voiced" in a meaningful fashion. Take as an example:

<subject> <action> <object> <adverb>

which matches the utterance "mario sees the princess poorly." The grammar rule that accepts this must correctly recognize and label the input. But for this construct to be completely useful, we also need to handle:

"if mario sees the princess poorly....."
"if I say that mario sees the princess poorly...."
"tell me that mario sees the princess poorly..."
"does mario see the princess poorly?"
"do i ask if mario sees the princess poorly?"

Streamlining grammar creation will allow us to enable existing code that permits Brainhat to interpret dialog input. Brainhat need not believe what you say, nor tell you what it thinks. When I say "mario sees the princess poorly," the only fact that the program will record is "i say that mario sees the princess poorly." Consider that what I *say* may be subjective; Brainhat may wish to interpret what I say based upon a stored inference. The reason this change is important is that it will provide a mechanism for processing idiomatic input without also considering literal meaning, and it will allow Brainhat to speak idiomatically. It will also allow Brainhat to lie, or take the interlocutor to be a liar, which will be crucial for gaming or role playing. In less severe cases, it will allow Brainhat to speak in response to its goals, in lieu of its beliefs.

And of course, there are other things in the works too... Any questions about anything?
Also, I have to make something like "move arm n inches in the X direction" work, for IAI. The "n inches in the X direction" will be an adverbial phrase. I need a number adjective to handle the "n."
March 13, 2002
I created a wild-card number attribute and some grammar to support the NASA project. You can say stuff like "robby move the arm 10 inches in the X direction."
I am throwing out the AUXTAGn collection in favor of a single tag, AUXTAG that refers to a concept. The concept can be anything, though by convention it might best be the rule where the auxtag was created. Again, the idea is that I want to be able to track the life of a CC from recognition through output.
The primitive will be:
auxtag concept
March 13, 2002
I’m working my way through the grammar. I am futzing with test cases for titular assignment. The one that zinged me was "the woman is a princess. is the woman a princess?" The answer came back "maybe" because the defined princess was already a woman, and to make the relationship work the other way around would cause a cycle. Routine title_assign refused.
o woman
 \
  \
   o
    \
     \
      o princess
Should title_assign reverse the positions of woman and princess? That would make any woman be a princess. I guess that would be the right thing to do in the presence of a proposition "all women are princesses."
But, "the woman is a princess" is a tricky one. I will a have to return to this.
March 22, 2002
I need a better comparison routine than vrfy_xchild. I ran into a problem with a comparison between "speaker says that mario does not see the ball" (the child) and "(does) speaker say that mario sees the ball" (the parent). The "not" is glued onto the root of the child, but comparisons are made from the parent’s point of view: do i find a child of each of my components in the child CC? Anyway, I am going to have to work up a kludge.
March 24, 2002
Interesting problem to find this late in the game.... processing templates for inferences, I have disabled qacands, checkcommand and inferences. But I forgot to limit compare from looking in the context buffer while processing inference templates. Is it enough to skip the context when NOQA is set to "true?" Probably....
March 30, 2002
I made a change to check_orth_things that could break something else. Particularly, I commented out the test to see if there was a pre-existing child/parent relationship. This was useless for test028.t, where "mario is the red block", followed by "luigi is the blue block."
Well... that caused a horrible problem. I guess I could set NOQA before running the grammar rules that do titular assignment...
April 2, 2002
I am grieving over how to do all-at-once parsing of input, and I suddenly think that it might be quite easy given the data structures I already have. I should simply have to merge all input rules into one meta rule. There’s more to it than that, of course. But that gets me a lot of the way there.
As for the problem I was having yesterday:
"I want to talk to
Kevin"
"I might want to talk to someone."
"Do I want to talk to someone?"
"Maybe...."
The issue is that "I might want to talk to someone" gets pulled from the context. "I might want to talk to kevin" doesn’t because ’kevin’ doesn’t get recognized as a candidate for the object position. This is because qacands and qacandsall are both avoiding the stuff in object roots. It might be hearsay, as in "the speaker thinks that the ball is blue." I don’t want "blue ball" becoming a candidate (I don’t think...).
Possible fixes:
1) make declques2.c more aggressive trying to find stuff in context.
2) make qacands/qacandsall decide when it is safe to mine the object roots for candidates.
April 3, 2002
"Mario was the king."
"What was Mario?"
"What was Mario?"
The above sequence breaks so horribly... don’t know why. The most peculiar thing is that it behaves better if I choose luigi instead of mario.
Hmmm.... a big clue: I changed
define ccsent
    label   ccsent
    rule    $r‘csent‘0[ $r‘csent‘1]
    map     ROOT
to
define ccsent
    label   ccsent
    rule    $r‘csent‘0
    map     ROOT
April 5, 2002
I was finding suspicious stuff coming back from vrfy_child, and weird things going into the negative cache for vrfy_child. I haven’t looked yet, but I think I can guess what was happening... Hmmm.... then again, maybe not.
The problem turns out to be that pullwhats makes copies of concepts with parents and children taken out individually so that chooseone can pick from among them. Once chooseone looks at a concept with missing parents, it gets removed as a candidate by the vcache. The right thing to do is shut off the negative caching inside vrfy_child when processing the results from pullwhats. Will do.
April 11, 2002
I was on CNNfn last night. It went pretty well. Rich has the database doing focus shifting in a pretty comprehensive fashion. Next, we are going to worry about how to load scenario information into the database.... notes follow.
(Later) Rich and I worked out some of the details about how we could store CCs in the database. I am looking at the notion of joining the context taken from the database to the running context in a temporary way. I don’t want all of the inferences and crap from the database to become part of the permanent collection. Rather, I want them to be available data for use by the rest of the program.
Stackable hash tables! I’ll tell you about them tomorrow!
April 12, 2002
Changed my mind. I am simply extending the notion of a hash table bucket to include a scenario ID. When new scenario data gets read into memory, I’ll hash it with the scenario ID. When it is time to move on to another scenario, I’ll go through and rip out anything with the existing ID number.
As for the context, I am going to build an intlink chain parallel to the context that keeps track of the scenario IDs. A special scenario ID, ’generic,’ will be for all things added by the program, and not sourced directly from the database.
Done. Now I need routines to add/subtract stuff from the hash tables and context. The routines will be:
addsctoctxt (con, scid)
rmscfrmctxt (scid)
The first routine will add concepts one at a time in some fashion (probably interleaved) with the existing context. It will create hash entries for the phash, chash, and whatever.
The second routine will strip all entries associated with ’scid’ from the context and clean up all corresponding hash table entries.
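A sketch of the pair, with stand-in types and the phash/chash bookkeeping reduced to comments:

#include <stdio.h>
#include <stdlib.h>

#define SCID_GENERIC 0

struct ctxent {
    const char *con;       /* stand-in for the concept */
    int scid;              /* scenario ID, SCID_GENERIC if home-grown */
    struct ctxent *next;
};

static struct ctxent *context;

/* add a concept to the context, tagged with its scenario ID */
static void addsctoctxt(const char *con, int scid)
{
    struct ctxent *e = malloc(sizeof *e);
    e->con = con;
    e->scid = scid;
    e->next = context;
    context = e;
    /* the real routine would also create phash/chash entries here */
}

/* strip everything belonging to scenario 'scid' from the context */
static void rmscfrmctxt(int scid)
{
    struct ctxent **pp = &context;
    while (*pp) {
        if ((*pp)->scid == scid) {
            struct ctxent *dead = *pp;
            *pp = dead->next;
            free(dead);    /* and clean up the hash table entries */
        } else {
            pp = &(*pp)->next;
        }
    }
}

int main(void)
{
    addsctoctxt("the ball is red", SCID_GENERIC);
    addsctoctxt("the mona lisa is a painting", 7);
    rmscfrmctxt(7);        /* focus shifts away from scenario 7 */
    for (struct ctxent *e = context; e; e = e->next)
        printf("%s (scid %d)\n", e->con, e->scid);
    return 0;
}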
April 13, 2002
Okay, the first routine is done. Now I am grieving over the second routine. Particularly, I need to think about what to do with the hash table entries and context entries created at such time as the focus shifts away and the scenario becomes inactive. It is possible that it will become active again, perhaps soon....
Next things to look at: bestfit and memory management.
April 17, 2002
We are almost ready to try and resurrect CCs from the database.
The context will contain a mix of scenario ID "generic" (generated) stuff, plus stuff pulled in from the database. I need to think of a way to commit stuff that is learned from the available-but-not-permanent stuff in the context. Basically, I need a way to change the scenario ID of stuff hanging around in the context from whatever it is to SCID_GENERIC.
How will I know when to commit? A question is asked; the answer comes from the scenario-specific crap in the context. I need to recognize that the result of a question needs to be committed to the context in some cases. I will certainly need a new mproc routine like addtocontext, except that it stores only if the scid is not GENERIC. (or maybe I will use addtoctxt plus some kind of signal flag).
April 26, 2002
I am on a plane on my way to St. Thomas via San Juan. I just spent some time making repopulation of the context during a focus shift behave. More to do....
April 29, 2002
On the plane back...
I have been thinking about how to bust up processing so that the speech components don’t get tangled with the NLP components. The database already acts as its own program. It seems that the engine could stand alone for use with text-only applications. Another program--a form of the engine--could communicate with speech engines to do all-at-once parsing of input, make end-point decisions, and provide speculative hints to the speech engines. The questions are where to break things up and how to coordinate the processing.
I could assume that the engine and this other component could share symbol tables and context through a shared memory space. The all-at-once parsing could contain tags and recovery histories so that the new process could provide a best guess stream of words and semantic symbols to the engine. I’m not sure where scenario shifting comes into play though. It will probably be too slow to help the speech engine.
All-at-once processing can be useful to the existing engine as well as to this new process.
Question: where does the current shifting take place? How will bestfit get incorporated?
On another note: I still haven’t figured out how to take a CC from the borrowed context of a scenario shift over to the working (0) context. I can easily add a flag into addtocontext to make it so that I can tell when I am being called from "questions." The trouble is that there is no way to tell that the thing I am being asked to add to context was the product of a question posed to a temporary piece of the context--a piece made current by a scenario shift.
Possible solutions: I could check the context to see if the fact that I have uncovered belongs to a non-zero scenario. The trouble is that I will now be checking the context for *every* answer generated. Or is that a good thing? I do have the context hash.... Hmmmm... Let me think about that. I could add an mproc that looks to see if the CC passed in is already in the context. If so, it could branch around a copy of addtocontext. Otherwise, it could get added to the context and to the discourse buffers. Or, the processing could be incorporated directly into addtocontext. Checking to see if a concept has already found a place in the context will keep me from repeatedly adding mundane facts....ala: "the ball is red. the ball is red..."
Did that. My input-patterns doesn’t seem to be able to override Rich’s input patterns. Why? Can’t test.
May 6, 2002
Working on the code that allows me to save stuff from one scenario into the context. It works if ’reap’ is shut off. Doesn’t appear to work otherwise. Not sure why. Tomorrow, I am leaving for California to go talk to press and stuff.
I created a routine to motivate scenario shifts based upon context affinity for something the user has said, and based on the probability that an inference might fire here or there. Rich hasn’t incorporated it into his code yet. The routine is called hlchoice(char *).
June 3, 2002
I am wondering why I made a routine called addintoctxt instead of simply using addtocontext.
June 4, 2002
Have a memory problem somewhere....
June 7, 2002
Thinking about parsing a little. I had an idea that the best way to improve the parse speed would be to compile all of the parse paths together into one big data structure, and hang exit points along it like berries on a vine. But thinking about it further, I think it might make more sense for me to simply cache the sub-matches.
For illustration, assume that the first part of an utterance includes a csubobj. All of the patterns that have a csubobj in the first position will be able to use the same successful or failed parse attempt. The keys to recognizing an attempt are:
1) the memory address of the portion of the input string being parsed.
2) the memory address of the pattern being attempted.
The ordering of patterns can be preserved as they are now. The memory savings from avoiding re-parsing the same input should be phenomenal.
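A toy of the cache I have in mind: key each attempt by the address of the input position and the address of the pattern, and remember the outcome. The fixed-size open-addressing table and the stand-in result are my own:

#include <stdio.h>
#include <stdint.h>

#define CACHE_SLOTS 4096

struct memo {
    const char *input;     /* position in the input string */
    const void *pattern;   /* the pattern attempted */
    int result;            /* chars consumed, or 0 for a failed attempt */
    int valid;
};

static struct memo cache[CACHE_SLOTS];

static unsigned slot(const char *in, const void *pat)
{
    uintptr_t h = (uintptr_t)in * 31u + (uintptr_t)pat;
    return (unsigned)(h % CACHE_SLOTS);
}

/* returns 1 and fills *out if we already tried this (input, pattern) */
static int memo_lookup(const char *in, const void *pat, int *out)
{
    struct memo *m = &cache[slot(in, pat)];
    if (m->valid && m->input == in && m->pattern == pat) {
        *out = m->result;
        return 1;
    }
    return 0;
}

static void memo_store(const char *in, const void *pat, int result)
{
    struct memo *m = &cache[slot(in, pat)];
    m->input = in; m->pattern = pat; m->result = result; m->valid = 1;
}

int main(void)
{
    const char *input = "mario sees the ball";
    const char *csubobj = "csubobj";      /* stand-in for a sub-rule */
    int r;

    if (!memo_lookup(input, csubobj, &r)) {
        r = 5;                            /* pretend we parsed "mario" */
        memo_store(input, csubobj, r);
    }
    memo_lookup(input, csubobj, &r);      /* every later rule reuses it */
    printf("csubobj at \"%s\": %d chars\n", input, r);
    return 0;
}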
June 16, 2002
When a meme becomes active, Brainhat is handed the concepts and CCs from the database, and adds them to the context and context symbol tables (ctext and csyms). The context symbol tables are shifted in and out with meme changes, which makes the symbols specific to the meme being looked at at a given time. This is good, however, we wanted CCs that are generated within a meme to be available to all memes, along with their corresponding concepts. This is happening for the CCs learned along the way; they end up in a common context. But it isn’t happening for symbols because the context symbol tables (ctext and csyms) are shifting along with the meme id.
I propose that we:

1) add symbols which are returned from the database to the symtab and txttab (but make them rw == TRUE) for the current meme. This is done in addtocontext_h.
2) shift the clean tables (symtab and txttab), but leave csyms and ctext alone--not shifted.
This will make it so that symbols that are referenced in conversation will find a permanent home in the context symbol tables. This will eliminate the complaints from the program about symbols being missing.
It looks to me like scenario.c is doing the shifting now, but the shifting is taking place with the context tables (ctext and csyms) and not the read-only tables (symtab and txttab).
June 17, 2002
I made changes to addinfctxt. It should be called just before addtocontext in input pattern rules csent-inference{1,2}.
June 18, 2002
To do:
Make sure that Mark Miller’s tutorial for Brainhat as a VoiceXML server works.
Is parequires working for "what color is the ball?"?
June 19, 2002
I am grieving over something I grieved over once before and never completed. The question is what to do about saving away questions so I’ll have a copy of them before they go through the likes of declques or attrques. The issue is that if I save the question before it is answered, I might not save the "right" question, since there may be multiple candidate questions, each a permutation of the possible choices for filling word slots. Likewise, if I wait until the question is answered before I choose to save it away, I may end up with a perfectly good CC representing the answer, but nothing that looks like the original question.
So, how do I solve this issue? One possibility is that I could mclone the question and hang it off its own special tag. The tag could survive through the question answering portion of the code and I could simply store it away in addtoqs after I know which form of the question "won." I still have the QUOTE tag hanging around. I could use that.
Any other possibilities?
(later) I stole quotques for the purpose. Early results look good.
June 20, 2002
I am working my way through Mark Miller’s FAQ. I came to a place where, in his example, an inference is entered after a fact that would fire it. In order to make Mark’s examples work, I need to make it so that the inference *can* fire on previously learned facts. Interesting.... why doesn’t speculate do that already?
(looked) Speculate only trolls two CCs deep into the context at present. I need to give new inferences a honeymoon whereby they can be exercised at many points within the context. I guess I could do this by simply looking to see if the first or second member of the context is an inference template. That sounds good!
June 24, 2002
Hmmm.... the question is: how do I decide what facts are interesting to the honeymoon sweep mentioned above? I can create hashes for the CONDITIONs and CONSEQUENCE. And I can create hashes for arbitrary propositions in the context, and their parents. But I don’t want to enable every inference for every proposition...
One thing that occurs to me is that, by virtue of an aborted approach to speculating from the database that Rich and I took once, I have hash tags dangling from CCs now--at least for the concepts appearing within the CC, though not their parents. Maybe I could attach all of the hash values when I run speculate. That would keep me from having to repeat a fairly expensive computation.
Hash values in hand, the matter of deciding which inferences apply to which propositions is a matter of comparing tags. I’ll also need a way to apply specific propositions to inferences, if such a method doesn’t already exist.
June 26, 2002
A problem with Mark Miller’s example:
Mark defines accommodations-1 and gives "rooms" as a synonym. The problem is that "rooms" is already part of the basic pool of words. So, when one later tells Brainhat "you have a sleeping pod; if you have a room then a room is available," the inference does not fire because it is looking at the wrong "room." The fix (I suppose) will be to package the VXML server without the pre-defined room-1.
(a little later) Looking at it more closely, Mark has said "if you have rooms then rooms are available." This does work, but it would be easy for one of his readers to make the same mistake that I did. I’ll try to soften the fall.
On another topic: I thought I had handily managed the honeymoon inference thing, as described above. Looking at some tests, though, I see that the hash values that are being assigned to CCs don’t always match those of the inference template. Hmmmm....
(A little later) It just turned out to be the room/rooms thing biting me in the ass again. I also have to worry about the use of the word "vehicle" in the colony 7 example; it clashes with the default words file.
July 1, 2002
Hmmmm.... I am looking at an issue that cropped up once before: how do I update context-resident copies of concepts so that all of the non-orthogonal attributes are included? The exemplar is "if i am in the water then i am wet." When I tell the program that I am in the water, it makes the inference that I am wet. The issue is that the statement it goes to save looks like:
define Root-8e71 [999963534]
    label       csent-inference-1
    cause       Root-8e4e
    effect      Root-8e67
    hashval     655
    hashval     599
"speaker (in the water) implies speaker (wet)". It should be: "speaker (in the water) implies speaker (wet, in the water)". Need to understand why the in-the-water speaker isn’t around when it becomes wet.
The problem turned out to be updatectxt in the ponder routines. It was setting back the rlinks of a concept that addtocontext had already updated, based on a primitive link count.
At some point, the algorithm for deciding when links should be updated from within addtocontext should go:
1) Are the links for the context resident copy and the copy within the context different?
2) If so, are they orthogonal?
3) If not, then merge and update the context.
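A minimal sketch of that decision procedure, with stand-in types and a made-up orthogonality test (the real check lives elsewhere in Brainhat):

#include <stdio.h>
#include <string.h>

/* hypothetical orthogonality test: here, two attributes clash when
   they share a class, e.g. "wet"/"dry" are both moisture values */
static int orthogonal(const char *class_a, const char *class_b)
{
    return strcmp(class_a, class_b) == 0;
}

static void update_links(const char *ctx_attr, const char *ctx_class,
                         const char *new_attr, const char *new_class)
{
    if (strcmp(ctx_attr, new_attr) == 0)
        return;                              /* 1) not different: done */
    if (orthogonal(ctx_class, new_class))
        printf("replace %s with %s\n", ctx_attr, new_attr);     /* 2) */
    else
        printf("merge: keep %s, add %s\n", ctx_attr, new_attr); /* 3) */
}

int main(void)
{
    /* speaker is "in the water" (a location); now he becomes "wet" */
    update_links("in-the-water", "location", "wet", "moisture");
    /* speaker is "wet"; now he becomes "dry" -- orthogonal */
    update_links("wet", "moisture", "dry", "moisture");
    return 0;
}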
July 6, 2002
Is the problem with test046 that the QUOTE isn’t comparing in a $Z match from within declques2?
Yep! That was it. QUOTE was being compared from within vrfy_xchild1.
July 7, 2002
I disabled the rampant assignment of HASHVALs from within tests.c. I wasn’t getting any value from it, and ddhashlink was consuming half of the CPU. Still, I have an issue to look into with honeymoon: why does the honeymoon inference not fire sometimes? If I say "i like food. if i like thing then i want thing", it works as expected. If I say "i like food. the ball is red. if i like thing then i want thing", it doesn’t work until I say "what do i like?"
Hmmm... the reason is that I am not storing all of the possible (parent) permutations of a CC in the context hash. When should I do that? In a ponder routine? Sounds like a good idea. I should be able to save some time in proposition2, I would think....
July 23, 2002
The component concepts of a CC that comes in from the database are not always being saved within addtocontext. The problem is that they come from the database as clean copies of things that were once dirty. I need to make them dirty again. Where should I do this?
July 29, 2002
I need to make it so that inferences don’t repeatedly fire. I had some ideas a few weeks ago....
Also, I need a routine to use instead of ’speak’ in csent-common. The new routine will say "okay" if the conchain is not empty. Or it will say "i didn’t understand" (or something) if the chain is empty.
The ’I might want to talk to someone’ problem I had a while ago appears to be an issue with satisfying TNP. In the context I have: "i might want to talk to someone" and "I want to talk to kevin." When I ask "do i want to talk to someone?," the fact that the ’someone’ matches brings me to a TNP comparison that pits future-imperfect against present. Hmmmmm.... I want the ’yes’ answer to win over the maybe....
The test case is:
i did want to talk to someone
i want to talk to mario
do i want to talk to someone?
July 29, 2002
"I must tell you something." "I hid some money."
August 2, 2002
>> what does it not have
Segmentation fault (core dumped)
August 24, 2002
"tell me what is the ball" has never been implemented. It sort of correctly says "what is the ball." "tell me you are glad" works, as does "ask me what is the ball." The best current kludge is "tell me the ball."
August 25, 2002
In the statue meme: "is the head valuable?" produces some inference-generated observations that end up in the discourse buffer. They should not be in the discourse buffer; the discourse buffer should be filled as the result of discourse!
(later) There was a kludge from August 28, 2002 where I made it so that self-talked stuff would show up in the discourse. This was coupled with speak being turned on for all non-questions. Now that I have speak generally shut off, stuff is getting into the discourse buffer even though it may not have been seen by the speaker. I need to undo the kludge, and I need to re-kludge (I think) so that stuff that goes through checkcommand as "tell the speaker...." gets into the discourse buffer.
I guess that if ’speak’ puts something into the global buffer, it should end up in the discourse. Also, it seems that there should be an mproc that calls speak to echo the input in the case that there is no pending output.
September 6, 2002
"I know you"
"What do I know?"
"you say to know me."
Hmmmm.... I want to see the ball. I know to go to the mall. I love to see the princess. I say to know you.
Some verbs work with the infinitive. Others don’t. I might need a parent and a cc-pattern to tell me which I am working with.
September 10, 2002
I am going to go through and try to straighten up all of the name stuff. It is pretty broken. I will run the tests first.
September 12, 2002
Went to Washington yesterday to talk to a Mr. David Steinberg. The plane didn’t blow up. Nothing blew up. Nice weather too.
"ask me what is my
name"
"ask me what is my color"
The first voices correctly, but the CC created is a broken. The second doesn’t even voice right.
September 17, 2002
Working on name-address assignments and uses... I fixed "my name is x" by reordering the rules in input-patterns.txt.
The other name-or-address assignments create CCs that look like this:
o extension
  ATTR -- o "111"
            ORTH -- o adj
  ATTR -- o Root
            PREP ----- o of
            OBJPREP -- o Rich
            CHILD ---- o adj
I need to make a couple of adjustments. For one, the restriction that the attribute be orthogonal to ’adjective’ is a little too strong.
Secondly, I need to make sure that the same treatment happens for ’name’s. Currently, the mproc that handles names adds a new label to the SUBJECT. I would like to continue to do that, but also create a copy of ’name’ with the attributes as shown in the example above.
September 23, 2002
If you refer to a ’he’ or a ’she’ and the program ends up finding a thing to resolve the ’he’ or ’she’ against, which is not a child of male-1 or female-1, BUT is a child of human.
September 26, 2002
Bug:
>> if a person has clothes then a person wears clothes.
if a somebody has clothes then a somebody wears clothes.
>> my friend has a suit
your friend has a suit.
>> mario has a sweater
mario has a sweater. mario wears a sweater.
October 1, 2002
Need to turn indefinite articles into definite articles....
Need to think about making inferences fire just once on a particular fact in a particular time-frame.
October 4, 2002
Hints! Vocabulary!
Time is running short before speech tek, but I am thinking that I might want to modify the SAPI client and Brainhat to be able to handle word and sentence hints. I already feed the client a static list of word hints, but with the database in place, I can start making use of dynamic lists.
In addition, I thought I might be able to feed sentence hints to the engine too. These would come from ready inferences; the conditions could be fed to the speech engine:
do you ...
are you ...
what are ...
what is ...
subject verb object (speaker is happy, balls are red...)
subject tobe object
subject tobe attribute
The subjects and attributes could be fleshed out so that I get some of the taxonomic advantages of the vocabulary. I could 1) get the children of inference subjects and objects that I find in the context, 2) get terminals of subjects and objects from the symbol table--terminals being at the level found in the inference or one level below, with no children.
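For the fleshing-out step, something like this sketch might do--a toy taxonomy standing in for the symbol table, emitting one "subject tobe attribute" hint per child:

#include <stdio.h>

struct concept {
    const char *name;
    struct concept *children[4];
};

static struct concept redball  = { "the red ball",  { 0 } };
static struct concept blueball = { "the blue ball", { 0 } };
static struct concept ball     = { "a ball", { &redball, &blueball, 0 } };

/* emit "subject tobe attribute" hints for a concept and its children */
static void hints(struct concept *c, const char *attr)
{
    printf("is %s %s\n", c->name, attr);
    for (int i = 0; c->children[i]; i++)
        hints(c->children[i], attr);
}

int main(void)
{
    /* from a ready inference like "if a ball is red then ..." */
    hints(&ball, "red");
    return 0;
}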
November 7, 2002
Rick Lewis is trying to use Brainhat as a FAQ server for a group of psychiatrists. He asked for some changes in the way Brainhat serves dynamic output. Rather than do the top.html and bottom.html thing we do now, he’d like to make the "form" statement choose what comes next. Here’s his suggestion:
<html>
<body>
<form method="PUT" action="http://my_brainhat/dialog?conv=2851&file=help.html">
<input type=hidden name="content" value="off">
<center>
<brainhat_out>Welcome to Intelligence by Brainhat</brainhat_out>
<p>
<brainhat_content>Please enter an information request sentence:<br></brainhat_content>
<input type=text name="Stmt" length=40>
</form>
</body>
</html>
The "conv" parameter was to be a cookie number. The "file" parameter described what file is to be returned with the next cycle (provided that Brainhat doesn’t "serve" a file as result of an inference). The "<brainhat_out>" tags describe where Brainhat should place the results of the previous evaluation cycle, or default text upon first evaluation. The "<brainhat_content>" area is for additional content, such as brainhat-generated hints.
If the "file" parameter were missing or the specified file were not found, then we would try to fall back to "default.html."
November 26, 2002
I need to look at condition/consequence pairs in some fashion to see if they have already tested true. That way, I can avoid re-evaluating things like: "if you are not a boy then you might want a dolly." I am thinking that I would like to do this in some cheap, reusable fashion. Perhaps I could keep track of hash numbers for conditions and consequences and avoid looking at things that *seem* to have been evaluated already (there is some danger that the same hash value pairs could appear for a different set of conditions and consequences, though it is not likely).
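The cheap version might look like this sketch--a linear table of hash pairs; the real thing would want the existing hash machinery, and as noted, distinct pairs could collide, so it is a heuristic:

#include <stdio.h>

#define SEEN_MAX 1024

static struct { unsigned cond, cons; } seen[SEEN_MAX];
static int nseen;

static int already_evaluated(unsigned cond_hash, unsigned cons_hash)
{
    for (int i = 0; i < nseen; i++)
        if (seen[i].cond == cond_hash && seen[i].cons == cons_hash)
            return 1;
    return 0;
}

static void mark_evaluated(unsigned cond_hash, unsigned cons_hash)
{
    if (nseen < SEEN_MAX) {
        seen[nseen].cond = cond_hash;
        seen[nseen].cons = cons_hash;
        nseen++;
    }
}

int main(void)
{
    /* pretend these are the hashes of "you are not a boy" and
       "you might want a dolly" */
    unsigned cond = 655, cons = 599;

    if (!already_evaluated(cond, cons)) {
        printf("evaluating inference once\n");
        mark_evaluated(cond, cons);
    }
    if (already_evaluated(cond, cons))
        printf("skipping re-evaluation\n");
    return 0;
}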
November 30, 2002
Thinking about it some more.... the reason why "if you are not a boy then you might want a dolly" gets repeated is that the tense gets changed on the way to the context. So when Brainhat tests to see whether "you might want a dolly," it finds "i do want a dolly" or "i don’t want a dolly." Testing the context for the consequences of an inference seems like an expensive way to keep track of whether to run inferences... I should look into the hash/cache anyway.
December 2, 2002
"what is the time?" breaks the database version. Why?
December 3, 2002
Found a buffer size issue. That might’ve been the problem....
On another front: hmmmm..... getconbyname is called from the ’s’ pattern as opposed to a call to searchsym. This is causing me to get just one ’s’ candidate. I was finding that the one candidate might be wholly inappropriate; it might be orthogonal when used. Anyway, I am replacing the call to getconbyname with a traditional symbol table lookup.
Later:
put the ball [in the bin]
move the ball [(for) 10 inches]
set the alarm [for 10 (thing)]
set the alarm [for 10 o’clock.]
Each is an adverbial phrase. When objprep has an attribute which is a quantity then do a "metric" and "magnitude" in the robospeak output. Otherwise, use "target" and "disposition" as the output fields.
December 4, 2002
The first part, ’set the alarm for 4’ goes okay, but the ’zegs’ part (interpreted as a new utterance) explodes.
December 5, 2002
Well, I am grieving over a number of things. Among the simpler ones is the ability for Brainhat to identify when it knows something. E.g., ’if you know X then....’ Brainhat could ask itself ’what is X?’ If the answer comes back as ’empty’ then the program doesn’t know.
Next, I really need to get around to eliminating verbatim mode and making idioms work.
I need to streamline the evaluation and caching of inferences. I also need to continue to add more structure to the way they are handled.
And, most pressingly, I need to handle *time*. There is some time functionality implemented now, but it probably won’t survive this next effort. The motivation is that we met with a group that is planning to put together a set top box with some intelligence in it. One of the applications would be an alarm clock, plus anything else that involves any form of scheduling.
I was thinking that I might create a new link type, TIME, which would hold the time in seconds since the epoch. This would be available for appending to CCs (so that we could age them in a ponder routine) and for dangling from time concepts. This will give the CC two representations.
o hours
  TIME -- o 103294324
  ORTH -- o time
  ATTR -- o 10
Chris is putting together an agent that will serve as an alarm clock for us. It will be on the other end of a TCP connection and will be available for such questions as "what time is it?" and "set an alarm for 109302043".
A program might say "if i say wake me at time1 then tell robby set an alarm for time1 and if the time is time1 then tell me to wake up." (Need to recurse on inference templates....)
December 6, 2002
Time is a thing. "11:30 on Tuesday" is a modifier of time.
o time
  TIME -- o 103294324
  ORTH -- o time
  ATTR -- o hours
            EMPHASIS -- o 10
Supplemental notes. I am in Germany at Bettina’s house.
Some things to work on: 1) Time.
2) Cache inference results.
3) Need to fix C2 level caching in find.c for database version.
4) The flag settings going into checkcommand for statements of the sort "if i ask if ..." are still broken; the condition ("if i ask...") is being executed while we are meme-shifting in focustest.
5) Need to get back to making the operation robust and predictable.
6) Need to cut the cord w.r.t. verbatim understanding of user input. In the same vein, I need to implement "if you know...." and that sort of thing.
Time waits for no one. I will start there...
"at 3 am"
o time
  TIME -- o 2048958945...
  ATTR -- o am
            EMPH -- o 3
"3 o’clock train"
o train
  TIME -- o 2048958945...
  ATTR -- o o’clock
            EMPH -- o 3
"wake me at 3 in the morning"
o Root
  SUBJ -- o brainhat
  OBJ --- o speaker
  VERB -- o to-wake
            ADVERB -- o Root
                        PREP ----- o at
                        OBJPREP -- o time
                                     TIME -- o 12892...
                                     ATTR -- o am
                                               EMPH -- o 3
How do I tell when I need to substitute "time" as a thing to hang a specific time on? I guess I can look for an article. Nobody ever says "I will see you at the ten oclock." They can say, however, "I will see you on the ten oclock train." Or "I will see you at ten o’clock (time)." I guess "time" could be implied and replaced by a thing if one is available. That would make treatment uniform for "i will see you at the ten o’clock (time)" even though no one says that.
Now, what about "I will see you monday?" That would be "I will see you (at) monday (time)".
Hmmm.... maybe all expressions of time should have an implicit "at." This would make "the ten o’clock train" be "the train (at) 10 o’clock (time)." "Wake me at 10 am" would become "wake me at 10 am (time)." "What time is the train" would become "where (when) is the train."
o train
  ATTR -- o Root
            PREP ----- o at
            OBJPREP -- o time
                         TIME -- o 04340243...
                         ATTR -- o o’clock
                                   EMPH -- o ten
December 18, 2002
"mario sees the red ball. does mario see the blue ball? does mario see the red ball?"
December 21, 2002
I am looking into a number of things from the list above. I’d like to get a 10 or 20 fold increase in the speed of inference evaluation. I think it might be reasonable to not call self-talk to evaluate the conditions of an inference. Conditions could be evaluated with a slightly better vrfy_xchild.
The other area that needs work is the generation of hash tests. When ponder discovers a CC, it looks into hashes for all of the parents of the concepts that participate in the hash. There can be a computational explosion if we are too deep in the taxonomies of the components. A verb five levels down, plus a verb 4 levels down, a noun 5 levels down, combine to create a group of 5x4x5 = 100 hashes to be investigated. It can be much worse if the hash components have multiple root parents. I just don’t have a good solution for solving this problem--a problem that Pascal tried to tackle in a scalar fashion so long ago. Is there a good algorithmic, linear-algebra-based or fractal solution?
At any rate, I really need to make some profiles to find out where the time is going. I could be making problems up....
test51?
December 26, 2002
Working on time. Adjectives used as emphasis are not checked for orthogonality. That means that you can get stuff like "10 3 pm", just as you could get "very not very big." Need to include emphasis in orthogonality checks.
January 2, 2003
I scrapped the treatment of time described above. Time is a thing. Hours, minutes, days, etc. are things. The number of minutes, hours, etc. are attributes.
I am grieving again over literal interpretation of utterances by Brainhat. Ala, "if i say that a thing is red then a thing is blue." I need a flag that says that I should only interpret what the program hears, but not take it literally. This should live alongside the VERBATIM flag. In fact, I have changed the use of the verbatim flag:
CREDITSPKR   VERBATIM   result
---------------------------------------------------------------------------
FALSE        FALSE
FALSE        TRUE       the ball is red.
TRUE         FALSE      the speaker says the ball is red.
TRUE         TRUE       the ball is red, speaker says the ball is red.
I left off today looking at utter_what and utter_what1.
January 6, 2003
Fixing a compound issue with the sequence:
i am happy.
i was happy.
1) The phrases "was happy" and "am happy" are orthogonal, and so both should be added to 'speaker-1'. addlink refuses to re-add 'happy-1' to 'speaker-1' because it sees them as being the same thing. addlink needs some new functionality.
2) tobecomp2 adds tenses to attributes, even in cases where it could cause an orthogonality violation.
January 7, 2003
"if i say that I am happy then i am red" "i am happy." "am i red" (maybe. you are red) The problem appears to be this gratuitous extra definition of speaker-1 in the local symbol table.
The intermediate labels are being stored in the symbol table to help with the database's re-population of CCs. Hmmmmm.... Do I want these intermediate definitions in the symbol table? They are necessary for re-populating extra CCs, but I don't know if they should be returned by getconbyname....
This is in routine addtocontext_i.
All that aside, the problem with "am i red" is tied to issues with the VERBATIM setting.... Looking into it.
January 8, 2003
Yesterday’s problem turned out to be nothing.
I just ran a profile of the database version cht-chatting about various things. This is what I found:
  time   seconds  seconds     calls  ms/call  ms/call  name
 16.88      4.16     4.16   4539494     0.00     0.00  value2int
 14.73      7.79     3.63   4763832     0.00     0.00  srchsym
 13.10     11.02     3.23   4811687     0.00     0.00  ccompare
 12.21     14.03     3.01  29064250     0.00     0.00  lset_element
 10.71     16.67     2.64     16041     0.16     0.16  keep_concept
  5.76     18.09     1.42   4632155     0.00     0.00  init_cargs
  2.76     18.77     0.68   4093417     0.00     0.00  cmatch_first
  2.27     19.33     0.56    660256     0.00     0.00  procsubrule
  2.03     19.83     0.50    661677     0.00     0.00  recursive_expand_rule
Routine value2int is about to be replaced with a unique hash function that can be referenced in-line. I am going to init the hash at program startup. The object will be to use a small hash table and a function that uniquely hashes each symbol covered by value2int. The function may not be discovered until runtime....
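Roughly what I have in mind--a sketch only, with a stand-in symbol list (the real list is whatever value2int covers today):

    #include <stdio.h>
    #include <string.h>

    #define TBLSIZE 1024   /* small table, power of two */

    static const char *symbols[] = { "subject", "object", "verb", "tense", NULL };
    static const char *slot_key[TBLSIZE];
    static int slot_val[TBLSIZE];
    static unsigned mult;  /* multiplier discovered at startup */

    static unsigned hash(const char *s, unsigned m)
    {
        unsigned h = 0;
        while (*s) h = h * m + (unsigned char)*s++;
        return h & (TBLSIZE - 1);
    }

    /* Try multipliers until every symbol lands in its own slot:
       the function is "discovered" at runtime. */
    void init_value2int(void)
    {
        for (mult = 3; ; mult += 2) {
            int i, ok = 1;
            memset(slot_key, 0, sizeof slot_key);
            for (i = 0; symbols[i]; i++) {
                unsigned h = hash(symbols[i], mult);
                if (slot_key[h]) { ok = 0; break; }
                slot_key[h] = symbols[i];
                slot_val[h] = i;
            }
            if (ok) return;
        }
    }

    /* In-line replacement for the old value2int lookup. */
    static inline int value2int(const char *s)
    {
        unsigned h = hash(s, mult);
        return (slot_key[h] && !strcmp(slot_key[h], s)) ? slot_val[h] : -1;
    }

    int main(void)
    {
        init_value2int();
        printf("verb -> %d\n", value2int("verb"));
        return 0;
    }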
January 13, 2003
I applied a cache mechanism to CC pattern matches, and to subproc rule expansion, but I am finding that in certain combinations, the caches don't work. After some thought, it occurs to me that the issue might be that using a tuple of {con, rule, cserviceid} might not be sufficient to tell whether a previously applied CC pattern match's results are still valid. Consider the case where a CC is modified in-place. Its address does not change, but its contents do, which would make the test above almost useless.
The fix might be to call clone_con() any time I effect a change to a CC or concept, and use the new address as the stand-in for the old one. Or, I could add a 'modified' flag to the concept. This could become part of the tuple, and would solve the problem that would arise when I get a concept as an argument, but not the address of the things pointing to it. While I'm adding new things to concepts, I should also be thinking about relocating some of the concept elements that have to do with rule processing to another auxiliary structure.
If I did add a ’modified’ flag, it should really be a copy of the tag value at the time the concept was last modified. This would make it possible to recognize when something was modified, for comparison against the tuple.
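Something like this sketch (structure and field names are hypothetical):

    struct concept {
        unsigned modtag;        /* copy of the tag value at last modification */
        /* ... the rest of the concept ... */
    };

    struct ccmatch_cache {
        struct concept *con;    /* concept the rule was applied to */
        int             rule;   /* rule id */
        int             cserviceid;
        unsigned        modtag; /* con->modtag when the result was cached */
        int             result; /* cached outcome */
    };

    /* The cached result is trustworthy only if the concept has not
       been modified in place since the entry was made. */
    static int cache_entry_valid(const struct ccmatch_cache *e)
    {
        return e->modtag == e->con->modtag;
    }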
A separate issue: I eliminated the use of T0 (a temporary) in favor of input-pattern caching. However, I am having an issue that the top-level pattern may be exercised more than once, which can cause side-effects. For instance, there's an input pattern that is for gathering addresses of various sorts. It calls roboport, which makes connections to external robots.
January 20, 2003
Working on grammar....
Imperatives:
"play something" -- handled by imp-action-1
"i say play something" -- sent-action-10/imp-action-1
"if i say play something then you are happy" -- grab-adjective/sent-action-10/imp-quote-2 (wrong!)
"do i say play something?" -- question-action-1/imp-quote-2
Work notes:
imp-quote-2 is at line 1034:

    rule [$c‘enablers‘5! ]$s‘imperative‘3!$s‘speaker-1‘0!$s‘brainhat-1‘2!$r‘extactions‘1! [$r‘csubobj-prep‘0! ][that ]![if ]$W‘#‘4

imp-action-1 is at line 1059:

    rule $s‘imperative‘3!$s‘brainhat-1‘2!$r‘extactions‘1! $r‘csubobj-prep‘0

I am commenting out imp-quote-2 to see if imp-action-1 wins if not occluded. Yes! "if i say play something then you are happy" now works. "do I say play something?" also works now. I think that I need to move imp-quote-2 down a few patterns.... I moved imp-quote-2 down to the bottom of the 'imp' section. All is good.

Now I am looking at: "brainhat plays something". It is getting snagged as an imperative by imp-action-1. What do I do here? If I could identify that brainhat isn't a robot then I could skip ahead to sent-action-10, I suppose. Looking at imp-action-1. Let me try that.

Hmmm... the difference between "brainhat plays something" and "brainhat play something" is being lost in imp-action-1. "play" is an imperative; "plays" is simply the verb. Currently, "toplay" is the parent, and the child of extactions. I could make sure that the VERB is a child of the third person. I am experimenting with 'extactions-3p'. I changed imp-action-1; now changing imp-action-2.
January 21, 2003
I am testing the new wildcard pattern match that Chris created for the input patterns. Created imp-action-4e.
Getting a clash when i say "winamp play king of the bongo." "king of the bongo" is stored as such. When I say "winamp play water song" I get something else. The difference (I guess) is that water song is stored under "hot tuna - water song." Looking to see... why does rule imp-action-4b eclipse imp-action-4e?
imp-action-4b is at line 716.
imp-action-4e is at line 740.
Fixed by swapping positions. The issue was that the "king of" part of "king of the bongo" matched a csubobj+adverb.
I want "something else" to be a working thingy, as in "play something else." Currently, csubobj4 soaks up "something" and leaves "else" behind. Looking into how I might do this...
"what else do you have?" is matched by question-what-10. That calls ccdiscrm. Let me see if I can steal that....
A new pair of "csubobj"s that handles constructs like:
something else
another thing
a different thing
something different
I created condiscrm to see if I can make "i want something else" work. Testing by placing a rule just before csubobj4.
All working, though I noticed that ’something’ is an x-template, but I am not sure why. I want to say "something else..." in the course of discussion, but it won’t behave the same way as "thing else" would. Hmmmm.... What will happen if I make "something" a synonym for "things?"
I also want to understand why imp-action-1 couldn’t reasonably use csubobj-prep-q in lieu of csubobj-prep. I would want that so that I could take advantage of the new "else" patterns....
January 23, 2003
Divorced.
Need to run tests after yesterday’s changes. Then I need to make the last few changes mentioned at the bottom of the previous paragraph.
January 27, 2003
Forgot whether I ran the tests. Re-running them now. There are issues with test059 and test064.
I removed ’something’ as a stand-alone concept and added "something" as a synonym for things. No discernible changes in the test output. Checking the input patterns that nominated ’something’ directly. Looks okay.
For review, now: what’s the difference between csubobj-prep and csubobj-prep-q? Just for kicks, I added subobj0 and subobj0a--clones of subobj0-q and subobj0a-q to the class of subobjs. In testing, I noticed an issue reading from ’brainhat.init.’ Not sure what is going on there... Fixed.
Looking at csent-inference-1. Something weird happening with conjunctions. Fixed.
Now I am looking at making the affinity ("wants") of a CC work a little better.
>> winamp hears.
>> if i say play something then winamp play a different mp3
if You say play something then winamp plays a different mp3.
When Rich adds "I have <something>", does he also add the individual concepts associated with the CC? I am getting lots of dirty messages from "getconbyname" when I talk about the MP3s.
An issue with "play foo...." It used to be handled by imp-action-1b. Now it is being grabbed by imp-action-6.
I am wondering if Chris’ wildcard pattern, $q, should be part of the subobj set. Might be very handy. I could amend it so that it also looks for strict $c candidates too. The problem with "play king of the bongo" is that "play king" is working out to be a match....
I created a rule, imp-action-1c that uses the $Q primitive to match everything to end of line. This rule appears before any of the other imp-action-1x rules. I am hoping that this will allow me to test ’king of the bongo’ before a rule gets around to trying it as
End of day; I haven’t tested the above fix yet, but if it works, it will make "if i say play something then winamp play something" work reliably.
I also need it to work for play something by so-and-so....
January 28, 2003
I am going to see about adding the wild card patterns as part of subobj. I don’t know what this will do to performance, but.... (try it later).
If I say play an MP3 then... If I say play an artist then...
I am now grieving over why "play overload" works once, but not twice. Clues.... addtocontext looks to see if something is in the context once already. If it is, it doesn’t re-add it. This may be why the inference is not re-firing. I could make an exception for CCs with imperative tense.
This was the issue. Here’s an idea: I could swap context locations for things that are recently re-mentioned... pull them forward in the context. This could be effected by simply changing the chain links.
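The relinking would look something like this sketch (the structure names are hypothetical):

    struct concept;     /* opaque here */

    struct ctext {
        struct concept *con;
        struct ctext   *next;
    };

    /* Find a re-mentioned concept in the context chain, unlink it,
       and relink it at the front. */
    void pull_forward(struct ctext **head, struct concept *con)
    {
        struct ctext *p, *prev = NULL;

        for (p = *head; p != NULL; prev = p, p = p->next) {
            if (p->con == con) {
                if (prev != NULL) {          /* not already at the front */
                    prev->next = p->next;    /* unlink */
                    p->next = *head;         /* relink at the head */
                    *head = p;
                }
                return;
            }
        }
    }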
January 29, 2003
Hmmmm.... The inference that says "if i say play an mp3 then winamp play an mp3" keeps firing after I use it the first time.
I just added a $Q rule as a csubobj-stmt... testing. Look for csubobj-stmtwc1 if something breaks.
Look into an 'exec' flag for exercise of statements like this one. It looks like the 'play overload' part may be causing side effects.
Fuck. Lots of shit is broken now. I compiled a non-db version. Testing. Seems to be an issue with the db version.... non-db is okay.
January 31, 2003
Looking at failure of the $Q pattern in sentences like "do You say play 02 - king of the bongo?" ...looking at $q, which captures the initial question: do you say play king of the bongo? I added a pause to imp-action-1d.
February 3, 2003
Fixed that.
I am running down messages from getconbyname wherein a concept that is presumed to be in the context is not being found. I am suddenly thinking that this might be related to verbatim mode. Testing the theory by turning verbatim on. That made a difference. Will investigate. I am also interested in understanding whether addtocontext is getting called repeatedly during meme shifting (and trying to save stuff).
Meme Set Focus: Brainhat Core
Brainhat, Copyright (C) 2002, Brainhat LLC, Version 2.030127.
>> winamp hears
Me.
>> hello
hello.
>> what is it?
getconbyname: cannot find "greeting-a0c4" in symtab(s).
getconbyname: cannot find "greeting-a0c4" in symtab(s).
getconbyname: cannot find "greeting-a0c4" in symtab(s).
getconbyname: cannot find "greeting-a0c4" in symtab(s).
getconbyname: cannot find "greeting-a0c4" in symtab(s).
getconbyname: cannot find "greeting-a0c4" in symtab(s).
I do not know.
>
[dowd@maddie ~/brainhat]$ ./brainhat
Meme Set Focus: Brainhat Core
Brainhat, Copyright (C) 2002, Brainhat LLC, Version 2.030127.
>> winamp hears
Me.
>> hello
hello.
>> break
Break in debug at the start:
debug> xspeak 1
You say hello.
winamp's address is 192.168.111.2.
debug> xdump 1
define Root-a0bc [999958851]
    hashval 147
    object greeting-a0c4
    person second
    number singular
    tense present
    verb tosay-1-a0c1
    subject speaker-1-a0c2
The issue is that greeting-a0c4 is not in the symbol table, which makes it impossible to locate when it comes time to store "the speaker asks what is greeting-a0c4." This only shows up in non-verbatim mode....
Found the issue. It occurs in lowpronouns.c, around line 320. I am looking into the symbol table for things that are referenced in the discourse buffer. In non-verbatim mode, there can be things in the discourse buffer that never make their way to the symbol tables. Ameliorating....
February 4, 2003
Working on a merge...
Unpacking the stuff I sent to Chris originally... into "deleteme". I will check to see what has changed since I sent Chris the source which he cleaned up....
February 5, 2003
Oooof. A couple of tough days.... A power glitch blew up maddie’s root partition yesterday. Today I was served papers because my landlord hasn’t paid his condo fee. I’ve been here *two weeks*!
February 6, 2003
"what do you have by the stones?"
I need to push the attribute "by the stones" down on the object as a requirement.
"do you have a house by
mario"
"do you have a house by luigi"
"do you have a house by mario"
There are myriad ’requires’ tags accumulating. See question-action-2.
"what do you have [that tobe] by mario."
the "[that tobe] by mario" section could be used to come up with an attribute that has tense (defaulting to present). It needs to be pushed down onto the object as a requirement when completed. The same form should work for "do you have a house that is red." This could be a kind of ccattr--"that is red."
Why isn’t there a csubobj for "house in the water"? I bet there is one, but that it caused issues. Let me go find it....
February 10, 2003
Working on winamp plug-in some more...
[dowd@maddie ~/brainhat]$ ./brainhat -u kdowd/kevin@127.1 +verbatim
Database Enabled: brainhat@127.1
Mplex Brainhat Core
Meme: Brainhat Core ID: 1 Location: localhost
Memelist-Local: (1)
Memelist-Extra: ()
Meme Set Focus: Brainhat Core
Meme Cycle: (1 )
Brainhat, Copyright (C) 2002, Brainhat LLC, Version 2.030207.
>> you have a house
I have a house.
>> do you have a house?
maybe. I have a house.
This is broken....
Fixed. Now the question is why the portions of previous questions are being stored. I ask "do you have a block by mario?" The "block" gets "by mario" as a requirement. That makes its way into the context.
Hmmm.... What if I say "you have a red house. do you have a blue house?" Will that be screwed up too? Yes. That’s broken too.
Shutting meme shifting off, verbatim on. Using ’pause’ to find the spot where things break. Broken after addtocontext....
Let me take a look at "speaker asks house is blue."
Hmmm.... mpareq is creating the illegal house (blue + red). It turns REQUIRES back into ATTRIBUTEs. If you have "requires blue, is red," you end up with "is red, is blue." The next question is "why is the broken result being stored into the context?"
Fix (I think): I need to check for ’auxtag no-object-context’ at the top of addtocontext.
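Sketched out (findauxtag is a stand-in for whatever the real lookup is):

    typedef struct cc CC;

    int findauxtag(CC *cc, const char *tag);    /* hypothetical lookup */

    int addtocontext(CC *cc)
    {
        /* question scaffolding is tagged no-object-context; don't save it */
        if (findauxtag(cc, "no-object-context"))
            return 0;

        /* ... the existing context-saving path ... */
        return 1;
    }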
Got it. Now I need to make 'you don't have an mp3 by mario' work correctly.
February 11, 2003
Ooof. There seems to be a problem with the meme-shifting commands that the workstation uses in combination with reap. I don't have time to look at it now because we are trying to pull together a demo. For the record, here is a clipboard of the problem.
[dowd@maddie ~/brainhat]$ ./brainhat -u kdowd/kevin@127.1 +verbatim +memerate +p trace
Database Enabled: brainhat@127.1
Mplex Brainhat Core
Meme: Brainhat Core ID: 1 Location: localhost
Memelist-Local: (1)
Memelist-Extra: ()
Meme Set Focus: Brainhat Core
Meme Cycle: (1 )
Brainhat, Copyright (C) 2002, Brainhat LLC, Version 2.030207.
>> break
find: rule debug-entry matched: break
FocusTest Meme Initial -1- Matched and Input Consumed
find: rule debug-entry matched: break
Break in debug at the start:
debug> loginnocycle kdowd/kevin
Database Enabled: brainhat@127.1
Mplex Brainhat Core
Meme: Brainhat Core ID: 1 Location: localhost
Memelist-Local: (1)
Memelist-Extra: ()
Meme Set Focus: Brainhat Core
debug> shiftnocc 1
debug> shiftoff
Meme Shifting Disabled
debug> cont
>> brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule csubobj-ana matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule actions matched: has a house by mario.
find: rule subobj1 matched: a house by mario.
find: rule subobj1 matched: mario.
find: rule subobj1 matched: mario.
find: rule csubobj4 matched: mario.
find: rule ccattr0 matched: by mario.
find: rule sent-action-attr-2 matched: brainhat has a house by mario.
find: rule sent-action-attr-2 matched: brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule sent-action-attr-2 matched: brainhat has a house by mario.
find: rule csent-common matched: brainhat has a house by mario.
find: rule ccsent-top matched: brainhat has a house by mario.
I have a house.
>> brainhat has a hospital.
find: rule subobj1 matched: brainhat has a hospital.
find: rule subobj1 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule subobj1 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule subobj1 matched: brainhat has a hospital.
find: rule csubobj-ana matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule actions matched: has a hospital.
find: rule subobj1 matched: a hospital.
find: rule csubobj-ana matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule actions matched: has a hospital.
find: rule subobj1 matched: a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule csubobj4 matched: brainhat has a hospital.
find: rule subobj1 matched: a hospital.
find: rule subobj1 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule subobj1 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule subobj1 matched: a hospital.
find: rule csubobj-ana matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj-ana matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule subobj1 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule csubobj4 matched: a hospital.
find: rule subobj1 matched: hospital.
find: rule subobj1 matched: hospital.
find: rule csubobj4 matched: a hospital.
find: rule tell-all-csubobj matched: a hospital.
find: rule sent-nomod matched: a hospital.
find: rule sent-action-10 matched: brainhat has a hospital.
find: rule sent-action-10 matched: brainhat has a hospital.
find: rule subobj1 matched: brainhat has a hospital.
find: rule sent-action-10 matched: brainhat has a hospital.
find: rule csent-common matched: brainhat has a hospital.
find: rule ccsent-top matched: brainhat has a hospital.
I have.
>> break
find: rule debug-entry matched: break
Break in debug at the start:
debug> xspeak 1
You say I have.
I have.
You say I have a house.
I have a house.
debug> xdump 2
define Root-085b-0869 [999997860]
    label sent-action-10
    child-of things
    auxtag no-object-context
    hashval 11
    subject brainhat-1-072f
    verb tohave-1-086a
    tense present
    number singular
    person first
debug> list hospital
(basic.pool.txttab) hospital->hospital-1
debug> sdump hospital-1
define hospital-1 [433]
    label hospital
    label hospital-1
    orthogonal building-1
    child-of building-1
debug>
>> goodbye!
[dowd@maddie ~/brainhat]$ ./brainhat -u kdowd/kevin@127.1 +verbatim +memerate +p trace +reap
Database Enabled: brainhat@127.1
Mplex Brainhat Core
Meme: Brainhat Core ID: 1 Location: localhost
Memelist-Local: (1)
Memelist-Extra: ()
Meme Set Focus: Brainhat Core
Meme Cycle: (1 )
Brainhat, Copyright (C) 2002, Brainhat LLC, Version 2.030207.
>> break
find: rule debug-entry matched: break
FocusTest Meme Initial -1- Matched and Input Consumed
find: rule debug-entry matched: break
Break in debug at the start:
debug> loginnocycle kdowd/kevin
Database Enabled: brainhat@127.1
Mplex Brainhat Core
Meme: Brainhat Core ID: 1 Location: localhost
Memelist-Local: (1)
Memelist-Extra: ()
Meme Set Focus: Brainhat Core
debug> shiftnocc 1
debug> shiftoff
Meme Shifting Disabled
debug> cont
>> brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule csubobj-ana matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule csubobj4 matched: brainhat has a house by mario.
find: rule actions matched: has a house by mario.
find: rule subobj1 matched: a house by mario.
find: rule subobj1 matched: mario.
find: rule subobj1 matched: mario.
find: rule csubobj4 matched: mario.
find: rule ccattr0 matched: by mario.
find: rule sent-action-attr-2 matched: brainhat has a house by mario.
find: rule sent-action-attr-2 matched: brainhat has a house by mario.
find: rule subobj1 matched: brainhat has a house by mario.
find: rule sent-action-attr-2 matched: brainhat has a house by mario.
find: rule csent-common matched: brainhat has a house by mario.
find: rule ccsent-top matched: brainhat has a house by mario.
I have a house.
>> brainhat has a hospital
find: rule subobj1 matched: brainhat has a hospital
find: rule subobj1 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule subobj1 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule subobj1 matched: brainhat has a hospital
find: rule csubobj-ana matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule actions matched: has a hospital
find: rule subobj1 matched: a hospital
find: rule csubobj-ana matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule actions matched: has a hospital
find: rule subobj1 matched: a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule csubobj4 matched: brainhat has a hospital
find: rule subobj1 matched: a hospital
find: rule subobj1 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule subobj1 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule subobj1 matched: a hospital
find: rule csubobj-ana matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj-ana matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule subobj1 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule csubobj4 matched: a hospital
find: rule subobj1 matched: hospital
find: rule subobj1 matched: hospital
find: rule csubobj4 matched: a hospital
find: rule tell-all-csubobj matched: a hospital
find: rule sent-nomod matched: a hospital
find: rule sent-action-10 matched: brainhat has a hospital
find: rule sent-action-10 matched: brainhat has a hospital
find: rule subobj1 matched: brainhat has a hospital
find: rule sent-action-10 matched: brainhat has a hospital
find: rule csent-common matched: brainhat has a hospital
find: rule ccsent-top matched: brainhat has a hospital
I have a hospital.
>> break
find: rule debug-entry matched: break
Break in debug at the start:
debug> xspeak 1
You say I have a hospital.
I have a hospital.
You say I have a house.
I have a house.
debug> xdump 2
define Root-085b-0869 [999997860]
    label sent-action-10
    child-of things
    auxtag no-object-context
    hashval 877
    subject brainhat-1-072f
    verb tohave-1-086a
    object hospital-1-086c-086e
    tense present
    number singular
    person first
debug>
Looking at hlscore. I am trying to figure out why "input consumed" is not true when I say "if i say play something...." and then I say "play something...." Applicable pattern is "imp-action-1c."
February 12, 2003
At some point, I need to make debug output all go to dbgfp, and change that fp as I enter and leave debug. Currently, much output goes to stderr, even when in debug mode.
What happened here?
>> i like you
You like.
>>
Also, the inferences that are supposed to fire when I say "the user might say...." don’t.
The rule that handles these statements, csent-inference-2, records the part that comes after "speaker might say X." I want to record "the speaker says X." Doing a little work. I need an mproc that goes from:
    o Root                     o Root
     \                          \
      \ COND        ==>          \ COND
       \                          \
        o "hello"                  o Root
                                 / | \
                           SUBJ / VERB \ OBJECT
                               /   |    \
                              o    o     o
                        speaker  tosay   hello
Why does "play that" no longer work?
I want this to work: "i might like music" "if i say that i like music then ...." Am I limiting the chaining of inferences too much?
February 14, 2003
Need to start working in the workstation. I want the word ’something’ to refer to an mp3 in the music meme, which means that thing-1 and mp3-1 will have to be overrides.
Hmmmm... I am looking into why "mario sees the mp3" resolves a song from the context, yet "play a mp3" does not. The first is handled by subobj1, csubobj4, tell-all-csubobj, sent-nomod, sent-action-10.
The second is handled by subobj1, imp-action-1.
"What mp3 do you have. Play that." I am ending up with nothing because of some symbol table issues.... Look into this.
February 18, 2003
Workstation....
* created a music meme.
* removed "something" as a synonym for things.
* added "something" as a synonym for music.
* removed "this" and "that" as synonyms for "it-1."
* added "this" and "that" as synonyms for music.
Shut off sent-action-10a because it was getting in the way.
February 20, 2003
I am looking at using cycletags to track whether a concept has already been involved in the successful execution of an inference.
Shutting off 'honeymoon'. Changed 'csubobj' in csubobj-stmt1 to 'csubobj-prep'.
February 27, 2003
"if i ask do you have someone and you have an mp3 by someone then you are..." The "you have an mp3 by someone" part is not handled yet.
"who is this by?"
March 5, 2003
Working on "what do you have by <foo>"
March 9, 2003
New web pages:
robot.html technicalspecs3.html conversations.html faqs.html
March 21, 2003
Sorry that I haven't written lately. I fixed lots of crap, and we have a show coming up in San Jose in a week and a half. I need to make a note about a discovery I made. The statement "what is your name", answered in meme 11, is captured by an inference that says "if i ask what is something ..." in meme 13. I hate scalars. They are really horrible approximations to the goodness (or otherwise) of CCs.
March 24, 2003
I currently set inference_triggered when it becomes apparent that hash tags have found something interesting--something that would fire an inference. Routine verify_hash_tags2 returns "MATCH" when there is a candidate. How about if, instead of "MATCH", the routine returned a scalar that gave some information about how distant the match was from the concepts that took part in the hash. In other words, the value returned by verify_hash_tags2 could tell how far up the parent chain we had to go to find a hash match. The closer to the original concepts, the better the quality of the match.
Hmmm, it seems that the call to ’getallparents’ is nearby. That’s where I want to get my scalar...
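Something like this sketch, assuming getallparents fills a chain with the concept first and successively more abstract parents after it (all of the names here are stand-ins):

    #define MAXPARENTS 32

    struct concept;
    int getallparents(struct concept *c, struct concept *chain[]);  /* assumed */
    int hash_lookup(struct concept *c);                             /* assumed */

    /* Return how far up the parent chain we had to climb before the
       hash hit: 0 for the concept itself, 1 for its parent, and so
       on.  -1 means no match at all.  Lower is a better match. */
    int hash_match_distance(struct concept *c)
    {
        struct concept *chain[MAXPARENTS];
        int n = getallparents(c, chain);
        int depth;

        for (depth = 0; depth < n; depth++)
            if (hash_lookup(chain[depth]))
                return depth;
        return -1;
    }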
I am looking at the method I use for iterating over the parents. Basically, I create an iteration space that simulates iteration over the space of an n-dimensional array. The method of iterating is traditional: vary the value of one pointer until it reaches a termination value, decrement the next pointer along the way, reset the first value and restart the iteration.
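The method, sketched (visit() stands in for the hash test; the chain lengths match the lineages below):

    #include <stdio.h>

    #define NDIM 3

    static void visit(int idx[NDIM])
    {
        /* stand-in for testing one tuple of parents */
        printf("%d %d %d\n", idx[0], idx[1], idx[2]);
    }

    /* The traditional odometer: spin the first counter through its
       range, carry into the next counter when it wraps, stop when
       every counter has wrapped. */
    void iterate(int len[NDIM])
    {
        int idx[NDIM] = {0};
        int i;

        for (;;) {
            visit(idx);
            for (i = 0; i < NDIM; i++) {
                if (++idx[i] < len[i])
                    break;          /* no carry; keep going */
                idx[i] = 0;         /* wrapped; carry to the next counter */
            }
            if (i == NDIM)
                return;             /* all counters wrapped: done */
        }
    }

    int main(void)
    {
        int len[NDIM] = {5, 3, 5};  /* mario, to-see, princess lineages */
        iterate(len);
        return 0;
    }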
The problem with using this method in cases where the indices represent parents of the concepts within CCs being hashed is that it takes no account of the notion that the deeper a concept sits in its chain, the more specific it is, and the better the hash match. Take the lineages here, for example:
mario <- man <- human <- animal <- thing
to-see <- to-sense <- action
princess <- woman <- human <- animal <- thing
I really want to look for inference candidates that start with the most specific parents. In other words, I want to raise my sights slowly, over all of the concepts, when I search for an inference to fire. This will make the more specific inferences tastier than the less specific.
Using the current iteration technique, these are the tuples considered:
mario, to-see, princess
mario, to-see, woman
mario, to-see, human
mario, to-see, animal
mario, to-see, thing
mario, to-sense, princess
mario, to-sense, woman
mario, to-sense, human...
Better that it should go:
mario, to-see, princess
mario, to-see, woman
man, to-see, woman
man, to-sense, woman
man, to-sense, animal
and so on...
I have an algorithm:

1) Take the lineage of all hash concepts (call getallparents).
2) Count the depth of each and record the depths as initial values i1, i2, i3.... Set counters e1, e2, e3 to the corresponding 'i' values. Set p1, p2, p3 to hashchain pointers.
3) On each iteration:
4) Process the hash pointed to by p1, p2, p3...
5) Step through the e-counters and identify the e-counter with the lowest value.
6) Decrement all the others.
7) If all have reached "1", then stop.
8) If any have gone to zero, restore the original value.
9) Locate (again) the counter with the lowest value. Advance its 'p' pointer.
March 25, 2003
Hmmm....
That was something of a bust.
How about if I take the depths of the tuple components, add them, sort the tuples, and then process them in order.
E.g., given:
mario <- man <- human <- animal <- thing
to-see <- to-sense <- action
princess <- woman <- human <- animal <- thing
mario, to-see, princess = 1 + 1 + 1 = 3
mario, to-see, woman = 1 + 1 + 2 = 4
mario, to-see, human = 1 + 1 + 3 = 5
mario, to-see, animal = 1 + 1 + 4 = 6
mario, to-see, thing = 1 + 1 + 5 = 7
mario, to-sense, princess = 1 + 2 + 1 = 4
mario, to-sense, woman = 1 + 2 + 2 = 5
mario, to-sense, human... = 1 + 2 + 3 = 6
I need to limit the depth so that the search space isn’t too huge. I need to limit it anyway, so the computational burden isn’t too huge...
    P1  P2  P3  ...  PMAXPOS2
    x
    x
    x    MAX DEPTH
I will create the space, and corresponding depth calculations, when I get the parents.
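A sketch of the depth-sum ordering, hard-coded to three components to match the example above (nothing here is the real Brainhat code):

    #include <stdlib.h>

    struct tuple { int i1, i2, i3, score; };

    static int by_score(const void *a, const void *b)
    {
        return ((const struct tuple *)a)->score -
               ((const struct tuple *)b)->score;
    }

    /* Enumerate parent-depth tuples up to maxdepth, score each by
       the sum of its components (so {1,1,2} scores 4, as in the
       example above), and sort so the most specific combinations
       come first. */
    int build_tuples(struct tuple *t, int d1, int d2, int d3, int maxdepth)
    {
        int n = 0, a, b, c;

        for (a = 1; a <= d1 && a <= maxdepth; a++)
            for (b = 1; b <= d2 && b <= maxdepth; b++)
                for (c = 1; c <= d3 && c <= maxdepth; c++) {
                    t[n].i1 = a; t[n].i2 = b; t[n].i3 = c;
                    t[n].score = a + b + c;
                    n++;
                }
        qsort(t, n, sizeof *t, by_score);
        return n;
    }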
March 26, 2003
For kicks, I have been looking into more intelligent ways to look through the hash space non-exhaustively to see what interesting things might be lurking in there. The current method, described several sections above, involves creating an iteration space from the lineage of each of the hash CCs and then checking each combination of parents for a hash match. It's a lop-sided method at best, and doesn't account for the notion that things near the bottoms of the parent chains are deliberately more specific than things near the top. Looking through the parent iteration space in a way that deals with lower-level concepts first, in a quasi-uniform fashion, will facilitate making *more specific inferences have precedence over less specific inferences*.
The iteration space can become huge--unmanageable, in fact--with just a few extra concepts. If the number of hash variables is N and the average depth of the parent chains is M, then the number of possible combinations I need to check goes to something like M^N (need to look in my combinatorics books)... huge, anyway.
So here’s another approach. Take the parent chains (N of them) and sample the lowest concepts on the chain to see if there is an inference hash match.
Then, remove 1 lowest element from one of the chains and try again. The chain will be chosen at random. The process will be repeated a number of times sufficient to *probably* test each possible combination.
    A   M   X
    B   N   Y        First time, test {D, N, Z}
    C       Z
    D

    A   M   X
    B   N   Y        Second time, test {C, N, Z} (or {D, M, Z}, or {D, N, Y})
    C       Z
Then randomly remove two balls, some number of times (Chris is working on a formula for me), from the bottoms of randomly selected chains.
The mental image that comes to me is a piece of metal being dipped end-in into a corrosive bath. I want to wear away parts of the bottom--these being concepts on a parent chain.
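A sketch of the erosion (test_bottoms stands in for the real hash test):

    #include <stdio.h>
    #include <stdlib.h>

    #define NCHAINS 3

    static void test_bottoms(int depth[NCHAINS])
    {
        /* stand-in for testing the tuple at the current bottoms */
        printf("test {%d, %d, %d}\n", depth[0], depth[1], depth[2]);
    }

    /* Dip the chains in the bath: test the tuple made from the
       current bottom of each chain, then randomly wear one chain
       away by one element and test again. */
    void erode_and_test(int depth[NCHAINS], int trials)
    {
        while (trials-- > 0) {
            test_bottoms(depth);      /* e.g. first time: {D, N, Z} */
            int pick = rand() % NCHAINS;
            if (depth[pick] > 1)
                depth[pick]--;        /* remove one "ball" */
        }
    }

    int main(void)
    {
        int depth[NCHAINS] = {4, 2, 3};   /* chains A-D, M-N, X-Z */
        erode_and_test(depth, 6);
        return 0;
    }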
April 3, 2003
I am on a flight returning from SpeechTek/AVIOS in San Jose. Dan, Rich, Chris and I all went. We had some good discussion with folks stopping by the booth, plus one scheduled meeting in Palo Alto. The downside was, as ever, that we received a lot of encouraging feedback, but from it we were again unable to set a particular course. We chose one for ourselves. I'll mention that in a moment.
Among the good stop-bys were a fellow from Convergys, a fellow from Motorola, and a very interested party from the Philips dictation engine group. Kevin Lento of Cepstral had a project we could look into with him. We also saw Lin Chase and re-kindled the notion of doing SMS/phone sex. A Joe Brown, former CEO of (name escapes me at the moment), stopped by and declared his intention to come see us in Hartford over the next month.
Dan's and my visit to Palo Alto was with Steve Victorino, the CEO of 'There' and an active party in a few other ventures. He was juiced about the customer service angle of Brainhat. It also appealed to him that Brainhat could serve the role of helper and host within the 'There' worlds. Next activity may include a trip to see one of his VPs located somewhere in Connecticut.
(Plus Steve Chambers (SpeechWorks), and a guy from Mindmaker.)
Anyway, I woke up the morning of the second day of the show and said to myself: "there's so much conflicting direction and possibility, and we are running out of time.... we should now do the phone sex thing in earnest and see what follows." The developments of the last year are going to make it more possible: we have meme shifting and the ability to interpret what we hear, rather than take it all literally. A good implementation of the Brainhat side should get some press, derision and funds...
I am also grieving over what to do with Chris. He is capable, but is a constant mixture of immaturity, hubris and unpolished work. He takes advantage of the loose environment in the office and has by no means made himself indispensable. I'm of half a mind to let him go, but Dan seems to think he's just "young."
The half-done stuff that needs completion: the windows version of the code runs slow (because...); the SAPI client still misses stuff and produces turds (I am going to take this code back). I don't know if the VoiceXML code works anymore following Chris' changes (really need this!). I don't know if the Web code works anymore following Chris' changes for the windows version. I don't know if the robot code works anymore following his changes. The Winamp client works well enough; there are issues, but I don't see any reason to continue with it in earnest.
Rich seems to be stalled. He was playing with the LBS stuff, and there may be utility in continuing. He has stopped working on the user interface. I think that the database and his abstractions for interfacing to the GUI seem good enough. The GUI itself is not very easy to use. I think it needs to be rebuilt on top of Rich’s hooks by another person--one who appreciates the usability problems with it; Rich is a little defensive about how it works.
As for the phone sex thing... this could be great fun, and may drag Rich back into the database business. We may even have him developing the memes.
April 8, 2003
Need to complete the algorithm for running inferences evenly, in a bottom-up fashion. Need also to *not* test those concept parents that don't appear in any of the inference templates. That should speed things up considerably; why test a parent if it isn't in an inference template? Anyway, to do that I need a list of concepts that make up inference hashes on a per-meme basis.
April 16, 2003
I am working on an agent to go between Brainhat and an SMTP agent. I had thought about making Brainhat do its own SMTP reception, though that might be overkill. Then I thought that I might use smtp forwarding--headers and all. This would mean I'd have some functionality within Brainhat that would involve parsing mail headers. Again, probably overkill. Then I thought I might just place an agent between the MTA and a daemon.
(MTA) <----------> filter <----------> brainhat daemon
We will need to have some statefulness in the database. Rich is thinking about that. I will need to create some kind of indicator (an X- header) to tell me whether incoming mail is part of an ongoing conversation or the start of a new conversation.
From: fred@whatever.com
X-bhat: X
<more headers....>

I like ice cream.

<possible previous banter, signature>
If the X-bhat header is present, then this is an ongoing conversation. I will connect to the daemon, issue some kind of debug command to resurrect the state of the conversation, and then say the next thing. If not, then I will start the conversation afresh. Either way, I will record the results, go back into debug and save the state. The results will be returned to the user.
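The filter's decision, sketched (the helper names are hypothetical):

    #include <string.h>

    int resume_conversation(const char *headers);   /* hypothetical */
    int start_conversation(void);                   /* hypothetical */

    /* An X-bhat header marks mail that belongs to an ongoing
       conversation whose saved state should be resurrected before
       saying the next thing; otherwise start afresh. */
    int handle_message(const char *headers)
    {
        /* look for the header at a line start, not inside a value */
        if (strstr(headers, "\nX-bhat:") != NULL)
            return resume_conversation(headers);
        return start_conversation();
    }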
April 16, 2003
Hmmmm.... added REQUIRE tag back into list inside of autohash_a. Not sure why it would be missing...
May 13, 2003
I found a ’keep_clink(p)’ call inside of create_link(). I commented it out... just in case something explodes.
(later) Put it back.
May 15, 2003
Well, we decided to stop pursuing Brainhat commercially. There's lots to do, and Rich and I have some good ideas. But.... they're going to have to be for the evenings.
July 3, 2003
Still here...
Hints: a new table in the database containing things we *might* hear; bestfit. We are trying to make a demo. Memory management matters still, too. I think I want to do reference counts for all primitive objects.
July 14, 2003
Yep, still here. We’re running after a potential opportunity suggested by David Thompson of Fonix. The idea is that we might be able to make a play with cell phone companies to drive up minutes by giving teenage girls something to chit-chat with.
I have been grieving over how to do dialog management, and have even thought about front-ending with an AIML engine. Time is of the essence, though. I am going to craft an mplex to talk about boyfriends and mothers and creepy little brothers. The new components are going to be:
Manual meme shifting.
Use of Rich's hints table to generate grammars/hints for a speech engine.
Use of Rich's hints table to filter idiomatic or varied input.
Use of Rich's hints table to filter output (perhaps).
July 16, 2003
I created an input filter with grammar rules much like Brainhat’s existing input grammar. Excerpt:
define xlate1
    label translation
    rule i see the $c‘things‘1
    xlate mario sees the %s
I'm going to try and extend it so that I can handle CCs in place of single words, and likewise be able to invoke subrules in translation patterns, ala:
define xlate2
    label translation
    rule i see the $r‘csubobj‘1
    xlate mario sees the %s
Not sure how to do it yet...
July 18, 2003
This worked out as per the following example. The linktypes (ATTRIBUTE, VERB, etc) tell what kinds of speak routines should be used to voice the matching components. See idiom() for more details.
/* how old are you? */
define xlate1
    label translation
    rule how old $c‘tobe‘1! $r‘csubobj‘2
    xlate where ^1 ^2 at
    map ROOT,VERB,SUBJECT
Now I have been grieving over setting microstate variables so that I can activate translations for one cycle. For instance, I might want to say "what is your name?" and then be ready to accept a name in a translation, ala:
/* how old are you? */
define xlatex
    label translation
    rule $e‘expectname‘2$W‘#‘1
    xlate my name is ^1
    map ROOT,ATTRIBUTE,IGNORE
Here, the test of the string "expectname" as a state variable tells whether the rule can succeed at all. If the variable isn’t set, the rule fails.
Within the programming, an inference could say: "if .... ask speaker what is speaker’s name and set expectname." In the next processing cycle, "expectname" would be revoked as a state variable.
BUG FIX: I need to look at all rules to see if parequires is necessary. I found a place where it was needed: question-where-1.
July 19, 2003
>> name
if You are glad then You want to know your’s name.
>> break
Break in debug at the start:
debug> break tests
debug> cont
>> i am happy
replacing happy-1 with happy-a57f
replacing happy-1 with happy-a57f
replacing tobe with tobe-a580
replacing speaker-1 with speaker-a581
replacing speaker-1 with speaker-a581
replacing speaker-1 with speaker-a581
replacing happy-a57f with happy-a57f
replacing happy-a57f with happy-a57f
replacing happy-a57f with happy-a57f
replacing happy-a57f with happy-a57f
askcc: are You glad?
evalprop: yes (1)
tellcc: You wants to know your’s name.
You are glad. You want to know.
There's a problem with expression of 'my' and 'your'.
Found that checking NOQA was eliminating pronoun dereference in lowpronouns by causing the routine to exit immediately. This is probably okay for "his" and "her", but may not be right for "my" and "your." I disabled the NOQA check to verify that the results would be what I hoped. I need to go back into lowpronouns and partition the processing.
July 21, 2003
I can change commitadj and the contents of the input pattern tell-all-csubobj to make it possible to commit things as answers to questions like "what are you?" or "what is your name?" The trick will be to recognize that there is a saved question of the form:
          Root
            o
          / | \
         /  |  \
    OBJ /  VERB \ SUBJ
       /    |    \
      o     o     o things
          tobe    |
                  o name
Then I can (ala commitadj) utter something along the lines of "my name is fred."
July 21, 2003
Possession needs to be orthogonal:
[dowd@maddie jennifer]$ ../brainhat -u kdowd/kevin@127.1 -i ./init +verbatim
Database Enabled: brainhat@127.1
Mplex Brainhat Core
Meme.1: Brainhat Core ID: 1 Location: localhost
Mplex jennifer
Meme.1: boyfriend ID: 14 Location: localhost
Meme.1: yourboyfriend ID: 15 Location: localhost
Memelist-Extra: ()
Meme Set Focus: Brainhat Core
Brainhat, Copyright (C) 2002, Brainhat LLC, Version 2.030703.
>> hi
hello. i'm like ohmigod... hello. what is your name?
>> my name is becky
your name is becky. are You a boy?
>> no
You are not a boy. You are a girl. You might have my boyfriend. do You have my boyfriend?
>> no
You do not have my boyfriend.
Related issue: "I might have a boyfriend" --> "I do have a boyfriend." When I test to see if I might have a boyfriend, I should be testing to see that I *do*.
I need to make it so that a given inference fires for a given set of hash tags just one time.
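One way to do it, sketched (a flat table for illustration; the real thing could hang off the inference itself):

    #define MAXFIRED 1024

    struct fired {
        int      inference_id;
        unsigned tagset_hash;   /* hash of the set of hash tags that matched */
    };

    static struct fired fired_tbl[MAXFIRED];
    static int nfired;

    /* Return 1 if this inference already fired for this tag set;
       otherwise remember the pair and return 0. */
    int already_fired(int inference_id, unsigned tagset_hash)
    {
        int i;

        for (i = 0; i < nfired; i++)
            if (fired_tbl[i].inference_id == inference_id &&
                fired_tbl[i].tagset_hash == tagset_hash)
                return 1;

        if (nfired < MAXFIRED) {
            fired_tbl[nfired].inference_id = inference_id;
            fired_tbl[nfired].tagset_hash = tagset_hash;
            nfired++;
        }
        return 0;
    }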
October 16, 2003
Re: Meetings in New York yesterday
The first meeting yesterday was with Russ Bartels. Russ came to visit us early last spring, and took us into Lincoln Financial Group to talk about self-help capability rendered by machine. The discussion didn't proceed because, at the time, Lincoln hadn't ever considered replacing service agents in that fashion; the concept was new to them.

Russ hasn't abandoned the idea. He believes that the market is huge, and coming. His personal research has led him to conclude that bots are the easiest vehicle for creating virtual assistants.

I told him that I was coming around to believe the same thing, but that there is real value added by applying knowledge, goals and inferencing on top of the bots. I argued that this could mean the difference between an engine that likes chit-chat, and one that wants to get something accomplished. If bots are simple to create, readily available and everywhere, then being the intelligent component that rides on top of bots has enterprise potential.

I suggested that Russ could book meetings to try and pitch this idea, and that I would be his tech-weenie and we could see if we could close someone.
My second meeting was with Richard Wallace, later joined by the pandorabots folks. This discussion was very fruitful. They listened eagerly to my notion of adding intelligence to the bots. Richard reinforced the idea; he talked about seeing something similar presented by Sprint's R&D group, where an agent that failed to find a response for a question pitched it over to Alice for chit-chat.

The pandorabots folks are willing to "unwind" the program in any way we need them to for hooks into Brainhat. They're also interested in having our SAPI speech interface client for their community. They suggested that we could make a deal with paypal and collect revenues for it.

Richard and the pandorabots people are seeing increased interest of late, but like us, they haven't made any money. I look at the components (including oddcast), the desire and the technologies and see the chance to cobble together an offering with a good commercial feel.

I will work with pandorabots to get them increased functionality, and perhaps revenue. That will bring us exposure to the bot community and increase our commercial potential.
-Kevin
October 18, 2003
"Do you know if mario is
happy?"
"Do you know mario’s mood?"
"Do you know mario’s address?"
"Do you know the speaker’s name?"
"Do you know who is the speaker?"
"Do you know who the speaker is?"
"Do you know what color the ball is?"
"Do you know what color is the ball?"
"Do you know how many people are coming?"
"Do you know where the party is?"
"Do you know if mario sees the ball?"
"Do you know who mario is with?"
"You want to know if mario
is happy."
"You want to know mario’s mood."
"You want to know mario’s address."
"You want to know the speaker’s name."
"You want to know who is the speaker."
"You want to know who the speaker is."
"You want to know what color the ball is."
"You want to know what color is the ball."
"You want to know how many people are coming."
"You want to know where the party is."
"You want to know if mario sees the ball."
"You want to know who mario is with."
"If you know if mario is
happy..."
"If you know mario’s mood..."
"If you know mario’s address..."
"If you know the speaker’s name..."
"If you know who is the speaker..."
"If you know who the speaker is..."
"If you know what color the ball is..."
"If you know what color is the ball..."
"If you know how many people are coming..."
"If you know where the party is..."
"If you know if mario sees the ball..."
"If you know who mario is with..."
"If you know everything...."
October 20, 2003
I added some code to addtocontext_i to make it so that symbols in ’object_no_context’ CCs would get saved into ctext.
I also switched the order of question-action-1 and question-action-2.
Eliminated sent-desaction-1a, -1an.
Changed csubobj-ana so that it recurses into csubobj-prep. This way, csubobj-ana is equivalent except that it saves whatever it matches as a possible anaphor. In the same vein, csubobj-ana (which calls ANAPHORS2) is stateful, and should be only referenced once per input cycle. Accordingly, I modified input-patterns so that csubobj-ana is referenced very near the top, and just once!
Changed csubobj-ana-q in similar ways.
Need to implement "you want to know..." as a future imperfect(?) Need to implement "(If) you know [that]...." as a yes/no thing. Should it return "yes" if I know the contrapositive? I think so.
October 23, 2003
Need to update hints. Chris' client code was all fucked up. It would be easier to fix it if hints had start and end delimiters. Going to make the delimiters be '(' and ')'.
Client is fixed.
Now, I am looking into connecting to pandora from Brainhat. It's stateless, http-based. I need to:

0) Cull commandline options that say I am going to talk to a bot: I need the bot ID and server name/address.
1) Open a connection and say 'hi.' From that I get a session ID.
2) Run the input cycle in parallel: give the bot the user input and give Brainhat the user input. When it comes time to speak, read the bot's output and look at Brainhat's output. Choose in some fashion (a sketch of the choosing step follows this list).
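One possible "fashion", sketched; the "I do not know." sentinel is borrowed from the transcript above as a stand-in for an empty Brainhat answer:

    #include <string.h>

    /* Keep Brainhat's answer when it said something substantive,
       and fall back to the bot's chit-chat when it came up empty. */
    const char *choose_output(const char *bhat_out, const char *bot_out)
    {
        if (bhat_out != NULL && *bhat_out != '\0' &&
            strcmp(bhat_out, "I do not know.") != 0)
            return bhat_out;
        return bot_out;
    }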
October 27, 2003
I am trying to get ’i might be hungry’ to work. There’s a mess with the future-imperfect....
Hmmmm..... descon breaks polarset.
November 9, 2003
"Ask me what I like to eat" explodes. Fixed.
November 10, 2003
Don’t want to lose my place... I am asking brainhat ’what color do i like’. It is answered by question-what-7. I created subjobj-q-3 to catch "color (things implied)". doask gets a statement sufficient to recreate the question, but I haven’t yet futzed with the ask routines.
"I might be a woman" records that I am a woman.... that’s broked. Left off looking into getbesttense in rels.c.
December 3, 2003
"I might be cute" is causing a copy of speaker-1 with cute as an attribute to be created.
January 27, 2004
Actually giving some thought to VRSim’s stuff....
Turn up the lights. Turn the lights on. Turn on the lights. Turn the lights up. Turn the lights halfway up. Turn the lights up halfway.
The VRSim stuff is muddled and confused. The fellow working on it seems to like to make things overly complicated. I’ve seen it many times before... it is a way for technical people to bolster their worth.
Anyway, I am not sure what the interface between the simulator and Brainhat would be--at least at the transport level. The grammar of the interface is explained. In any case, I thought I would create a robot shim between the simulator and Brainhat.
Now I am looking at the Activmedia simulator. This looks like great fun:
Go to my desk. When you are at my desk get my pen. When you have my pen bring it to me.
Like the simulator, the interface can be a robot shim. Places can be defined as X,Y coordinates in the shim for now.
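For instance (hypothetical names and coordinates):

    /* Places defined as X,Y coordinates inside the robot shim, so
       "go to my desk" can resolve to a target position. */
    struct place {
        const char *name;
        double x, y;
    };

    static struct place places[] = {
        { "my desk", 3.0, 4.5 },
        { "the door", 0.0, 0.0 },
    };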
February 23, 2004
I am thinking again about The VRSIM stuff (which I haven’t worked on at all). The object of the game is (to state the superficial):
1) Correctly interpret some input.
2) Create appropriate semantic structures
3) (perhaps) produce some external commands.
Step 1 is a grammatical chore for command and control. There are a number of ways to say the same thing, and all of them are boring. E.g., "turn the light on", or "turn on the light." I don’t remember how I handled this in the past, but I am already groaning over whether "on" should be treated as an adverb or an adjective in these statements and in storage.
I added a facility that I think I called idiomize some time back. It was (is) meant to preprocess input in a coarse, pattern-matching way ala AIML, except that you can use taxonomic inheritance and recursive rules in the patterns.
I am thinking that I am going to make more extensive use of the idiomize capability to reduce the burden of trying to understand all of this command and control language from basic grammar rules. That makes step 2 much easier.
For step 3, I am going to use a robot of some sort. Scott of VRSim described an interface to simulation that uses http as the transport. I don’t like using http because it is synchronous. Rather than soil Brainhat with it, I am going to run Brainhat in daemon mode and have a robot make the synchronous calls.
    Simulation C&C
          |
          V
    --------------
    |  idiomize  |
    --------------
          |
          V
    --------------
    |  reg. bhat |
    --------------
        ^ |
        | V
    --------------
    | sim. robot | <----> to/from simulator
    --------------
"Turn the lights on" makes:
define Root-8799 [999965320]
    label imp-action-1a
    child-of things-879b
    auxtag no-object-context
    hashval 439
    tense present
    adverb Root-879a
    tense imperative
    subject brainhat-879d
    verb toswitch-879e
    object lights-87a2

define no-object-context [3364]
    label no-object-context

define present [2259]
    label present
    child-of tense

define Root-879a [999965334]
    label ccattr1a
    objprep things-879b
    prep on-879c

define things-879b [1195]
    label something
    label thing
    label anything
    label things

define on-879c [7079]
    label on
    label on-1
    wants things
    orthogonal proximities
    child-of proximities
    child-of preposition
The "on" becomes an adverbial phrase with a NULL object of the preposition, e.g. "on things." Having a prepositional phrase in the adverb role makes some sense at the time the command is issued. You can think of the prepositional phrase as a modifier for the verb by thinking of the question:
"How should I turn the lights?"
"Turn the lights on"
"Turn the lights quickly"
The adverb needs to become a property of the lights (an adjective, more or less) pretty quickly once the robot acts on the command. Does that work? Let me check....
It does work, but by an ugly coincidence: the imperative "robby switch the bulb on" produces
start
tag=000035099
command=to switch
object=bulb
disposition=on
target=something
stop
The statement that is stored with #35099 is "robby switched the bulb on." When the robot responds that the act has completed, Brainhat mumbles "robby did switch the bulb on." The auxiliary verb, "did", causes a different grammar rule to be exercised than we saw with "robby switch the bulb on." This alternate rule stores the prepositional phrase, "on things", as an attribute of the bulb.
The danger is that if there were an appropriate tense/number copy of "switched" in the vocabulary, then Brainhat would have told itself "robby switched the bulb on", which would have been taken as another imperative, in the past tense.
Hmmmm.... the present tense is the only acceptable imperative. I wonder if I have tackled this before... extaction-3p... January 20, 2003. Okay, switched was incorrectly identified as a child of extaction-3p.
I may have broken something.... or not, but my fix was to make the rule extactions refer to extaction-3p. The question is: is there any use for the parent extaction?? What was extaction-v? I think it was to help me differentiate between using the prepositional phrase as an adverb versus an adjective. E.g., "mario put the ball in the water" as opposed to "mario takes the ball in the water."
I need to look at imp-action-1a because it would match for both types of verbs--"put" and "take."
Many of the imp-action verbs can be folded into idiomatic expansion.
Ach! Chris didn’t know the difference between a socket and a file pointer... Lots of bogus comparisons in the code....
February 24, 2004
Here’s what I will do:
1) depend on extaction-3p to identify imperative uses of a verb.
2) depend on extaction-3pv to identify imperative uses of a verb where the prepositional phrase that (may) follow the object is to be treated as an adverb. This will cover verbs like "put", "place", "deliver."
3) keep extactions-v so that I can differentiate non-imperative use of verbs that will count a prepositional phrase as an adverb (again, "put", "place", etc.)
Here’s what I found:
sent-not-action1a
    rule $r‘csubobj-prep‘0! $C‘not-enablers‘3! $r‘extactions-v‘1! $r‘csubobj‘2! $r‘adverbs‘4
imp-action-4a
    rule $s‘imperative‘3![$c‘totell-1‘5! ]$c‘xrobot‘2! [to ]$r‘extactions‘1! $r‘adverbs-prep-solo‘4! $r‘csubobj-prep‘0
imp-action-4e
    rule $s‘imperative‘3![$c‘totell-1‘4! ]$c‘xrobot‘2! [to ]$r‘extactions‘1! $q‘things‘0
imp-action-4b
    rule $s‘imperative‘3![$c‘totell-1‘5! ]$c‘xrobot‘2! [to ]$r‘extactions‘1! $r‘csubobj‘0! $r‘adverbs‘4
imp-action-4c
    rule $s‘imperative‘3![$c‘totell-1‘5! ]$c‘xrobot‘2! [to ]$r‘extactions‘1! $r‘csubobj-prep‘0! $r‘adverbs‘4
imp-action-4d
    rule $s‘imperative‘3![$c‘totell-1‘4! ]$c‘xrobot‘2! [to ]$r‘extactions‘1! $Q‘mp3-1‘0
imp-action-4
    rule $s‘imperative‘3![$c‘totell-1‘4! ]$c‘xrobot‘2! [to ]$r‘extactions‘1! $r‘csubobj‘0
sent-action-attr-2a
    rule $r‘csubobj-prep‘0! [$c‘enablers‘3! ]$r‘extactions-v‘1! $r‘subobj‘2! $r‘adverbs‘4
sent-action-14 * commented out
    rule $c‘brainhat-1‘0! $c‘extaction‘1! $Q‘music-1‘2
imp-action-1d
    rule $s‘imperative‘3!$s‘brainhat-1‘2!$r‘extactions-3p‘1! $q‘things‘0
imp-action-1c
    rule $s‘imperative‘3!$s‘brainhat-1‘2!$r‘extactions-3p‘1! $r‘subobjQ‘0
imp-action-1a
    rule $s‘imperative‘3!$s‘brainhat-1‘2!$r‘extactions‘1! $r‘csubobj‘0! $r‘adverbs‘4
imp-quote-1
    rule [$c‘enablers‘5! ]$s‘imperative‘3!$s‘speaker-1‘0!$s‘brainhat-1‘2!$r‘extactions‘1! [$r‘csubobj-prep‘0! ]![that ]![if ]![about ]$r‘sent-nomod‘4
imp-quote-2a * commented out
    rule [$c‘enablers‘5! ]$s‘imperative‘3!$s‘speaker-1‘0!$s‘brainhat-1‘2!$r‘extactions‘1! [$r‘csubobj-prep‘0! ]if $W‘.‘4
imp-action-1b
    rule $s‘imperative‘3!$s‘brainhat-1‘2!$r‘extactions-3p‘1! $q‘things‘0
imp-action-1
    rule $s‘imperative‘3!$s‘brainhat-1‘2!$r‘extactions-3p‘1! $r‘csubobj-prep‘0
imp-action-2 *actions-3p?
    rule $s‘imperative‘3!$s‘brainhat-1‘2!$r‘actions-3p‘1! $r‘csubobj-prep‘0
imp-action-6
    rule [$c‘brainhat-1‘2! ]$s‘imperative‘3!$s‘brainhat-1‘2!$r‘extactions-3p‘1! $W‘ ‘0
imp-quote-1d
    rule [$c‘enablers‘5! ]$s‘imperative‘3!$s‘speaker-1‘0!$s‘brainhat-1‘2!$r‘extactions-3p‘1! [$r‘csubobj-prep‘0! ][that ]![if ]$r‘simple-statement‘4
imp-quote-2
    rule [$c‘enablers‘5! ]$s‘imperative‘3!$s‘speaker-1‘0!$s‘brainhat-1‘2!$r‘extactions-3p‘1! [$r‘csubobj-prep‘0! ][that ]![if ]$W‘#‘4
simple-statement-7
    rule $s‘imperative‘3!$s‘speaker-1‘0!$s‘brainhat-1‘2!$r‘extactions‘1! [$r‘csubobj-prep‘0! ]![if ]![that ] $r‘simple-statement‘4
simple-statement-15
    rule $s‘imperative‘3!$s‘brainhat-1‘2!$r‘extactions‘1! $W‘ ‘0
February 27, 2004
What a couple of days I had! Not good, either.
I went down to Chattanooga to talk with some people doing a start-up called A.I. Tech. Software. At the outset, the discussion was to be about hiring me, though I have to admit that I had a separate agenda; I was looking for a contract. In part, because I don't want to live in Tennessee. In other part, because I don't want to abandon Brainhat; they were talking about merging IP, but unless I am really confident, or unless someone is holding out some cash, f'get it.
Anyway, I met first with George Daoud, whom I came to like very much. We talked about their architecture for a CRM package, which included elements of text mining, NLP, etc. Brainhat does much of what they were planning to do. Adding Brainhat to the mix would probably cut their development time by 18 months, conservatively. It would be the largest component of their solution.
But they’re terminally cheap. A.I. Tech is in this depressing, split-level store front in a place called Red Bank, across from the Rusty Duck (biker bar) and a pawn/gun shop. The office is located in the upper deck with paper across the windows so that people can’t peek in. The other businesses include a Christian book store, a ministry, some going-out-of business gift shops and crap. The furniture is new at least; desks were $35/each, clearance at Office Depot.
I later met with Bill, the salesguy. He and George knew each other from a previous company, Exobrain. They painted a picture of despair w.r.t. their common experience. Exobrain was run by a megalomaniacal Brit who managed to run through $14M within two years. The money was raised entirely afoul of SEC regulations. There were lawyers. George and Bill both lost money--were both investors. It was in Exobrain's death throes that they met Mehrdad, the CEO and lead investor for A.I. Tech that I was to meet the next day. Mehrdad was part of a group that proposed to add funds to Exobrain to save it, which never happened.
That night, A.I. Tech put me up in a $41/night dingy hell-hole with ants, paper-thin walls and a threesome having a fucking contest next door. Wait 'til A.I. Tech comes here... they're going to the Berlin Turnpike... I always wondered how turnpike motels survive, but now I know that they're for businessmen with bloated egos and bizarre notions of start-up rites of passage.
Let me tell you about Mehrdad: he is a smart fellow--Stanford Ph.D.--who made his money in broadband modems, et al. He thinks very highly of himself. Speaks in an avuncular, self-affected, wizened fashion. He's got all the right contacts. He suffered a badge-of-honor heart attack. I should aspire to be like him. Full of shit, he is. He made some money. That's all.
Mehrdad flew out on a red eye from California to meet me. If there were any chance of working there, it was already squashed by Red Bank, the Day’s Inn, etc. I was unimpressed by the austerity. Now the bad feelings were finalized:
We went to Chili’s for lunch, and for the hard-sell. Mehrdad has been lavishing compliments on me for an hour now. He has heard that the technology is good. Now, he wants me to pitch Brainhat in the mix and take a job in this dump.
I told him that the IP wasn't mine alone, and that we'd spent a considerable amount of money developing it. Furthermore, I told him I'd seen no business plan, heard no offer details. He keeps playing the all-for-one, gotta-have-faith angle. I had to make it clear for everyone at the table that transferring the IP would be a merger, at the very least. We talk about that. Mehrdad suggests that Brainhat's portion of a merged company would be in the single digit percentages--"low teens at best." He goes on about how a small piece of something is superior to a large piece of nothing. He talks about his contacts again. It was pathetic. Eventually, I held up a nickel and said: "do you see this? this is Brainhat. it is the last coin I have to spend. i am going to spend it wisely (or put it back in my pocket.)"
There was an uncomfortable retreat from the conversation on all sides. We went back to the office for a while and I headed to the airport. I don’t know if I’ll ever hear from them again. If I do, it’ll be for a contract. I can shorten their development time by 18 months. Sheesh....
March 15, 2004
Beware...
I am at VRSim, working on a language interface to the simulator. Last week, I modified the robot code so that it could communicate with the simulator via HTTP gets, both inbound and outbound, à la:
Set up connectivity between Brainhat and VRSim's simulator. Brainhat maintains a persistent connection on one side. VRSim's thing uses http as the transfer. This complicates things a bit because I have to maintain stateful connections on one side and stateless connections on the other. Particularly, I have to mix an 'accept' call for inbound connections with 'read' calls for an open connection, all within the envelope of a select.

-------------              ------------                   -------------
| bhat daemon | < ---- >   | this robot | <---- http get  | simulator |
-------------              ------------   http get ---->  -------------
It works.
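For the record, the shape of that select envelope, sketched from memory; the descriptor names and the buffer handling are stand-ins, not the actual robot code.

/* One listening socket for the simulator's inbound HTTP gets, plus
 * the persistent Brainhat connection, multiplexed under select(). */
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

void robot_loop(int listen_fd, int bhat_fd)
{
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(listen_fd, &rfds);           /* new simulator connections */
        FD_SET(bhat_fd, &rfds);             /* persistent Brainhat link  */
        int maxfd = listen_fd > bhat_fd ? listen_fd : bhat_fd;

        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
            break;

        if (FD_ISSET(listen_fd, &rfds)) {   /* stateless side: one shot */
            int conn = accept(listen_fd, NULL, NULL);
            if (conn >= 0) {
                char req[1024];
                if (read(conn, req, sizeof req) > 0) {
                    /* ... translate, forward to Brainhat, reply ... */
                }
                close(conn);
            }
        }
        if (FD_ISSET(bhat_fd, &rfds)) {     /* stateful side */
            char buf[1024];
            if (read(bhat_fd, buf, sizeof buf) <= 0)
                break;                      /* daemon went away */
            /* ... package as an HTTP get toward the simulator ... */
        }
    }
}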
The next thing to do is make a basic translation from the robot code into the simulator grammar for command and control.
Scott Lovell has given me some samples. I will start with a basic command, such as "turn the lights on." Currently, I need to nominate a robot when I give an imperative. I am going to look into a way to grab the last robot referenced in the context as the target for an otherwise unspecified actor. So, for instance, if I say "turn the light on," it will apply to the simulator. One question for self... how do I rule out the possibility that the command is meant for Brainhat, e.g. "tell me .... "?
Hmmm.... The code in checkcommand looks for an xrobot. I could instead look for a verb in the imperative tense. Then I could check to see if a robot is specified in the subject. If not, I could search the context for a robot as a replacement for the subject (which would probably be brainhat). If I don’t find it, pass the imperative to Brainhat.
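In outline, that checkcommand change might look like the following. Every type and helper named here is a hypothetical stand-in for the real Brainhat internals, so this is a sketch of the logic, not the implementation.

/* Returns 1 if the command was redirected to a robot, 0 if it
 * should fall through to Brainhat itself. */
struct concept;                             /* opaque for this sketch */
extern int  is_imperative(struct concept *verb);
extern int  is_robot(struct concept *c);
extern struct concept *verb_of(struct concept *cc);
extern struct concept *subject_of(struct concept *cc);
extern void set_subject(struct concept *cc, struct concept *subj);
extern struct concept *last_robot_in_context(void);

int resolve_imperative_target(struct concept *cc)
{
    if (!is_imperative(verb_of(cc)))
        return 0;
    if (is_robot(subject_of(cc)))
        return 1;                           /* robot named explicitly  */
    struct concept *robot = last_robot_in_context();
    if (robot) {
        set_subject(cc, robot);             /* "turn the light on" ->  */
        return 1;                           /* last robot referenced   */
    }
    return 0;                               /* leave it for Brainhat   */
}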
I will want to move the code for "describe" up. Or perhaps it should no longer be treated as a special case. Then again, I might want to be able to say "robby describe x" as opposed to the default "[brainhat] describe x."
The command structure looks like:
X{O1},{R2},{O2},{R3},{O3},{OA}
Where:
• O1 = Object #1
• R2 = Relationship of Object 1 to Object 2
• O2 = Object #2
• R3 = Relationship of Object 1 to Object 3
• O3 = Object 3
• OA = Auxiliary field
The "X" signifies that this is a command.
The operands--O1, R2, etc.--are bracketed UUIDs. I am going to try to make these the first synonyms for each associated word for now. I may need some kind of UUID field later.
I am trying to make sense of the grammar....
X{sun},{set},{value},{datafield},{enabled},,
the subject is "sun," the verb is "set," the object ("value") is 0 or 1; the word "datafield" refers to a restriction of the verb ("set"), and "enabled" says what the meaning of the datafield could possibly be.
X{die},{set},{sun},{datafield},{position},,
the subject is "die," the verb is "set," the object is the "sun"; the word "datafield" refers to the use of the 0 or 1, and "position" says what the meaning of the datafield could possibly be.
BUGREPORT:
In check_orthogonal: need to check for orthogonal prepositions. Currently I only check for orthogonal objpreps.
I added some UUIDs for some words in Brainhat. I created TXTPENULTIMATE in dospeak. Then I made it so that robospeak would use TXTPENULTIMATE to get the UUIDs out. Now I need to package things the way the simulator would like to see them.
Something I am pondering: the ontology of the simulator is more complex than that of Brainhat. There are slots of a sort. I’m sure that Brainhat has some ontological structures that the simulator doesn’t have, though I haven’t found them yet. Anyway, the trouble is that I need to map one ontology onto the other. I am about to do this with code, but it’ll be ugly. Or could I do it with patterns? Is it reasonable to assume that everything will fit a mold? I could dynamically map things in some fashion....
Might I duplicate everything they have inside Brainhat--all the slots? Or?
Next question is whether they should be using something like Protege.
BUGREPORT:
turn the sun on. turn the sun off.
March 18, 2004
At VRSim:
I need to create a taxonomy of message types for communication with the simulator.
I am going to need a grammar for parsing stuff coming back from the simulator.
I am wondering if I should be using XML between Brainhat and the robot code to simplify parsing, and make output viewable in a browser.
"Is the sun on?"
This is a request for the simulator. Should I ask the simulation or expect the answer to be found in the context? How do I know that I have to ask the simulator? Should I tag simulator objects so that I know that if I don't find the answer in context I should go out to simulation (rather than going back to the user?... If Brainhat asks itself if the sun is on and doesn't know, it will ask the user...)
(later) I talked with Scott about his plans: he says that (eventually) there will be a way to introduce conditional code with side-effects attached into the simulation cycle. Conceptually, in English: "if the sun is on then tell brainhat that the sun is on."
Big architecture issues:
• I need to recognize that particular objects are simulator objects--probably via some common ancestry to a simulator object.
• Likewise, I need to recognize that when I want to know the answer to a question that the simulator can answer, I need to go to the simulator.
• Eventually (but when?), I will be able to hand the simulation cycle some kind of executable to test conditions and tell me when an interesting state change has occurred.
• I need to restrict object/verb association to better match the simulator’s ontologies. Brainhat is perfectly happy that the user should, say, wear the sun or eat the sun. These actions will be undefined in simulation.
• Need to architect a way to traverse Scott’s ontology and replicate the salient parts into Brainhat taxonomies. I asked Scott where the source of the ontology is--is it a flat file, for instance. Currently (or forever? dunno...) the ontology is being crafted in C++ code. I suggested he look into Protege.
I have some grammar brokenness to fix.
Well, for the near term, the VRSim people need to have some idea that this is going to work. So, I’ll dress up what I have done thus far, with an eye toward re-use. I will:
Create a little grammar to go with simulation.
Add a few more concepts/UUIDs.
Categorize some command types and hand-integrate them into robotspeak.
March 22, 2004
At VRSim.
Tim sent this as a proposed dialogue:
Conversational Control
Dialogue 1:
User:   System
System: Working
User:   I would like to build a simulation.
System: What kind of a simulation?
User:   One that shows the JSF CTOL variant engine, and highlights the fuel system.
System: How do you want to interact with the simulation?
User:   I want to be able to remove the parts of the fuel system with the standard set of hand tools.
System: OK this will take some time. While you are waiting I can display the engine and let you walk around it. Would you like to do that?
User:   No, send me an email when it is ready.
System: As you will my lord
Hmmm....
I think this all looks like a lot of fun, but I’m not sure I would know how to get the simulator to do these things, or how to get it to tell me something will take a while.
A more skeletal version of the above might be:
User:   System
System: System is working
User:   I would like to build a simulation.
System: Describe the simulation.
User:   Show the JSF CTOL variant engine and highlight the fuel system.
System: Describe interaction.
User:   I want to be able to remove the parts of the fuel system with the standard set of hand tools.
System: OK this will take some time. While you are waiting I can display the engine and let you walk around it. Would you like to do that?
User:   No, send me an email when it is ready.
System: I do not know your address.
I can see use for an expanded version of translations and idioms.
(10 minutes later) I expanded idiomize so that it runs recursively.
Now, I am turning my attention back to the simulation taxonomy. The next class of statements I am going to tackle are status inquiries, such as "is the sun on?" To interpret the answer I will need:
1) to understand that I am asking a question that needs to be delegated. This knowledge will have to be derived from the "wants" of the thing in question. E.g., "is the sun on" will cause me to look beyond the context by virtue of the fact that the "sun" wants the "simulator," and "simulator" is an xrobot. I will only go outboard if I know that the simulator hears me.
2) I need a simple grammar to parse the response from the simulator.
Later, I will come back to scalar values, e.g. "where is the x...?"
So, I looked at attrques2 to see whether it might be a good hook into the process of asking the simulator about the state of something {on, off, up, down, 3 inches to the right, whatever...}. It could be. I would need to generate some work for checkcommand, perhaps.
The problems are: I don’t want to cause context state changes. Consider that I might ask "is the sun on?" I am going to go to the simulator for the answer. I don’t have any primitives for caching the answer temporarily, so I will have to go to the simulator every time I need the answer (that’s problem #1). Since Brainhat asks itself questions repeatedly, the simulator will have to answer repeatedly. Is this going to be a problem?
Assuming that the context doesn't keep copies of state for simulator variables, the next problem is what do I do about changes of simulator state in mid-inference. Consider a situation where Brainhat is evaluating something based on whether ambient is turned on. As the inference is being evaluated, someone grabs a knob (or something) and turns it on. What does that do to Brainhat?
I just talked to Scott about the question answering.... He says that I will have to wait a whole cycle for the answer, so I guess I will need to cache copies of the answers and age them somehow. I guess that this means that I will have to commit the answers to context, with some form of time-stamp, and then periodically rip the context-resident answers out.
Okay, so here is what I will do:
1) If I can’t answer an xrobot-related question directly from context, redirect the question to the xrobot. So, for instance, if the question is "is the sun on?", it will get patched up to look like: "ask sim is the sun on," based on the fact that the "sun" is associated with "sim." Not sure where this happens in the mprocs, but it’s in there somewhere....
2) The question will get issued by checkcommand.
3) The xrobot (simulator) will answer (eventually). Brainhat will hang around waiting.
4) The context will be updated with the new information when the answer comes back.
5) A "ponder" routine will walk through the context and strip out the information as it ages (see the sketch below).
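A minimal sketch of steps 4 and 5, assuming the cached answers can live in a simple time-stamped list. In Brainhat proper they would be context-resident CCs, and the TTL here is a guess.

#include <stdlib.h>
#include <string.h>
#include <time.h>

#define ANSWER_TTL 5                        /* seconds; a guess */

struct sim_answer {
    char *question;                         /* e.g. "is the sun on"    */
    char *answer;                           /* what the simulator said */
    time_t stamp;
    struct sim_answer *next;
};

static struct sim_answer *answers;

void cache_answer(const char *q, const char *a)
{
    struct sim_answer *sa = malloc(sizeof *sa);
    sa->question = strdup(q);
    sa->answer   = strdup(a);
    sa->stamp    = time(NULL);
    sa->next     = answers;
    answers      = sa;
}

/* The "ponder" pass: strip answers out of the list as they age. */
void ponder(void)
{
    time_t now = time(NULL);
    struct sim_answer **pp = &answers;
    while (*pp) {
        if (now - (*pp)->stamp > ANSWER_TTL) {
            struct sim_answer *dead = *pp;
            *pp = dead->next;
            free(dead->question);
            free(dead->answer);
            free(dead);
        } else
            pp = &(*pp)->next;
    }
}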
Starting with "ask sim if the sun is enabled..."
March 24, 2004
I am grieving over the use of "on <something>" and "off <something>" as used for the likes of "the sun is on <something>," et al. The words "on" and "off" are prepositions, and I need to map prepositions to Scott's ontology as "get/set position." Consider "put the pointer on the target." However, the use of "on" and "off" for talking about lights is idiomatic. Likewise "turn the light on" or "turn on the light" is idiomatic. It really means enable/disable.
So, if "on" and "off" are adjectives, then the statements above would need to work when I say something like "turn the light blue" or "turn blue the light." This is a mess as well.
Assuming that I don't define "on" and "off" as adjectives, I want the least kludgy handling I can find. Three alternatives present themselves:
1) hand-craft code to recognize idiomatic use of "on something" and "off something" and stick it in the simulator interface.
2) motivate enabled/disabled by inference, e.g. "if a light is on then a light is enabled"
3) translate the actual input via "idiomize." That way, before Brainhat tries to parse the input, things like "turn on the light" will be rewritten to "enable the light." Note that this won't catch use of "if the light is on" in inference templates.
Trying the above. Also, I have to re-tweak the simulator interface so that it maps "enabled" correctly.
(later) Argh! This UUID crap is so arcane. I'm trying to read the files Scott produced in lynx, but it doesn't interpret the contents of the {... horrible UUID thing ...} as text/html. I've been bashing at the files to change the names and internal references, but I am going blind because of these stupid brackets. I'd be happy just editing the files, but then again, the names are of the form {... horrible UUID thing ...}. Does it have to be so complicated?
Need something just before imp-action-1 to handle statements of the sort: "set the light [to ]enabled" or "make the light enabled."
Okay.... tired.... Need to create test cases for:
sim set the sun enabled.
set the sun enabled.
sim set the sun to enabled.
if i say enable the sun then sim set the sun to enabled.
and so on...
March 25, 2004
At VRSim.
Made a big improvement in cycle detection within evalprop2 so that inferences don't get run twice. HOWEVER, this is not going to be good for situations where an inference can get run twice for the same set of hash tags. To understand where this might matter, consider that we might be running in a scenario where the same inference condition is met twice (e.g. "disable ambient light; enable ambient light; disable ambient light"). The question is, how do I remove the cycle detection so that the inference becomes a candidate again? I think this is part of the big issue of "back-out." In some way, parts of the context, the hash, the inference hashes will all need to be connected so that I can undo things. I don't think it'll be all that hard...
March 25, 2004
At VRSim.
Some folks from EB were in here yesterday looking at a welding simulation. I was ostensibly watching the demos so that I could see the kinds of commands being issued--for modeling of dialog by Brainhat. Partway through, one of the EB folks started talking about having a virtual instructor in the demo to tell you that things are going well, or not going well, etc. So, I think my focus shifted....
That may be okay; the more I do to meet the initial requirements--language-based simulator control--the more apparent it becomes that the infrastructure to use the interface is a ways off; Scott is doing a fair amount of production work.
March 27, 2004
Hmmm... I think that I detected that Tim Gifford is asking whether simply having speech output wouldn’t be sufficient for the welding application. It’s probably true; some canned output here and there would do just as well (or better) than having the output be semantically driven.
Since we are a long way from having the infrastructure to control the simulation, Brainhat is sort of out of the picture for the time-being. In fact, it makes the work I did so far little more than a preening exercise. I don’t ever want to hang around burning through a budget, and since the money isn’t very good anyway, I am going to wander off for a while until the need for NLP becomes real.
On another note: I see that we have a little following in the chat groups talking about bots and stuff. I traded email with Dan Collins last night about going the last few feet to make the source code available for developers. I think I’ll go ahead. I really need to stop imagining gainful employment or lucrative business coming from Brainhat. I need to get back into network infrastructure.
Today (Saturday morning now) I am cleaning up the code. I got rid of all the bprintf crap that Chris put in. I got rid of the PARMSETMATCH junk from Rich’s excursion into grammar-generated output. I need to skinny the size of a struct concept down so that every copy doesn’t have the baggage of a rule along with it. I need to go back and shine up the memory management and add string allocations into the picture.
April 14, 2004
I just arrived from Germany. I went to visit Bettina's folks after dragging my feet on it for quite some time. I worry that I have burned all bridges at this point. Paula is adamant that she'll isolate me if I have anything to do with Bettina. Away for six days, and not a single phone call. I've really gotten to the point where I don't matter at all.
The people from Red Bank say that they’re interested in exploring acquisition of Brainhat again. I can’t get Dan or (especially) Rich to even write me back about the project’s future. I guess I got all the telltale support I needed when I ended up cleaning the office out by myself.
What the hell am I going to do with myself now?
June 8, 2004
I am back at VRSim. The piece that I am interfacing to has changed a bit. My last stab at it had me changing the state of lights, etc. These objects are now gone. The new objects are more basic--floating point numbers, etc.
I am going to work on reducing the vocabulary and scope of dialog in Brainhat in favor of command and control. Consider the following dialog:
create a float.
call it testvalue.
add 3.14159 to it.
add 2.718 to it.
what is the value of testvalue?
In this dialog, all of the work of allocating and arithmetically manipulating the float is relegated to a robot.
I also need a more general and more efficient pre-processor for input (à la AIML or 'idioms').
June 14, 2004
At VRSim, 1:00 PM.
What would be a good comcon grammar?
Some possible commands:
new {thing}
{thing} is attribute
{thing} is not attribute
make {thing} attribute
delete {thing}
put {thing} [to ]place
move {thing} [to ]place
move {thing} {how}
where is {thing}?
show {thing}
hide {thing}
get {thing}
new {number}
add {number} to {number}
what is {number}?
I will need to:
1) organize and streamline the vocabulary to reflect com/con.
2) organize and streamline the grammar to reflect com/con.
3) create/organize an idomatic language base for com/con.
4) create test cases specifically for command and control.
I will plan to retain the conversational ability of the program, but pare a fair amount of the vocabulary. My changes will be along the main trunk of development--not a branch. That means that I'll save what I have today and toss it out of the distribution.
Left at 3:00 pm. Pruned the vocabulary.
June 15, 2004
At VRSIM. Planning to work on comcon statement grammar using examples above.
When I say "new {thing}", I want to create a {thing} that is orthogonal to all others. How shall I approach that? Perhaps I could recognize "new thing" in grammar and have an mproc routine that tosses all dirty copies off the chain. That sort of works... Actually, I’ll add an mproc to the stack for an existing rule-- tell-all-csubobj.
That seems to have worked. There is a new routine called brandnew that implements the above effectively.
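Roughly what brandnew does, as described above; the concept type and the chain-walking helpers are hypothetical stand-ins for the real symbol-table code.

/* Before "new {thing}" resolves against the context, toss the dirty
 * copies of that thing so resolution falls through to the clean
 * definition and a fresh, orthogonal instance gets cloned. */
struct concept;
extern int is_dirty(struct concept *c);
extern struct concept *next_copy(struct concept *c);
extern void unlink_copy(struct concept *c);

void brandnew(struct concept *clean)
{
    struct concept *c = next_copy(clean);
    while (c) {
        struct concept *nxt = next_copy(c);     /* save before unlink */
        if (is_dirty(c))
            unlink_copy(c);
        c = nxt;
    }
}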
June 17, 2004
I am going to VRSim later today. They are having a cadre of visitors from EB or somewhere. Then, I am supposed to hook up with Rick Romkey and Scott Kasper tonight, presumably to drink too much, though I have absolutely no interest in that.
Where was I?
June 21, 2004
Back at VRSim to spend the day hacking.
So, here is a basic sequence of things I want to be able to do:
create a light        (simulator indicates that light is created)
make it red           (simulator indicates that light is red)
turn it on            (simulator indicates that light is on)
is it on?
turn it off           (simulator indicates that light is off; actual dirty
                       copy of light must change in Brainhat)
destroy the light     (simulator indicates that light is destroyed; actual
                       dirty copy of light must disappear from Brainhat)
Idioms are really working here. I need to think about how to make them more efficient, and how to make them apply on a meme-to-meme basis.
I created an mproc routine, brandnew, for making fresh copies of clean stuff from the context. Now I need to find a way to destroy stuff ("destroy the red light"), and how to create copies of things with an orthogonal attribute modified ("make the blue light red").
Implementing "destroy" seems straightforward enough. I'll have to think about how to change attributes on things....
Hmmmmm.....
If I am dealing with an orthogonal attribute (e.g. red/blue), then I already have the appropriate link type hanging from the "thing." I could crawl through the context changing the copies of the "thing." Likewise, I could make any verbs that touch the attribute go into the past tense. E.g., "the ball is red" becomes "the ball was red." Similarly, "mario sees the red ball" could become "mario saw the red ball."
Hmmmm....
First, I’ll deal with "destroy." "Destroy" will work through checkcommand because it is an imperative (just like "create...."). The difference is that Brainhat doesn’t know anything about the connection between the verb "to create" and the magical creation of a new thing. "Destroy", on the other hand, involves some thought and some processing.
I could make it so that Brainhat deletes all of its copies of the thing to be destroyed before/whether the simulator has to do the same. The only problem is that I might want to wait for the simulator to have completed destroying something before wiping out my copies. Otherwise, Brainhat will no longer be able to talk intelligently about the thing that was supposed to have been destroyed.
Hmmmm....
I guess I could do the following:
1) destroy a "thing" in checkcommand if there is no robot to field the operation.
2) delegate "destroy" to a robot, and then do the correct thing when the robot has completed the destruction. A message will come back like "brainhat did destroy ’thing’". I can either field the past tense/non imperative "destroy" command with a grammar rule, or I could rewrite the statement into another imperative, e.g. "forget ’thing’" for execution by checkcommand.
MAYBE I don’t want to delete the ’thing.’ MAYBE I want to have some memory of it. MAYBE I want to make it no longer available in the dirty symbol table, but otherwise be able to answer questions about it. This sounds right....
June 29, 2004
Back at VRSim to spend the day hacking, 9:00 AM. So many days slip by in between....
I am talking to Dan Collins about getting Brainhat back. There seems to be some traction in the conversation. The current proposal is that I give him $50K and he gets 10% of whatever is returned in a sale.
Scott and Tim just came to see me. They are asking to be able to elucidate what the simulator is doing. Scott is sending me some new vocab.
Working on my rack-mount in the simulator room....
Hmmmm...
I am looking at the current vocabulary. There are some big semantic differences between the insides of Brainhat and the sim-sim code. This is not a surprise; however, the challenge of bridging the semantic gap cleanly is an issue. I have two basic choices:
1) ingest the semantics of sim-sim into Brainhat,
2) write a piece of semantic bridging code to fill the gap. If I were to write the bridge code, it would belong in vrsimipc.
Further hmmmm..... I am looking at the examples and I don’t see semantic consistency in them. Not sure what to do here.... I guess I could hard-code/map everything I see. Ugh.
Here’s my solution, described in a note to Tim Gifford:
Tim,

In my earlier experiments I pushed some notion of the semantics of the simulator into Brainhat and asked the code to create simulator EndeaVR relationship records directly. It worked in the limited examples I had on hand. I thought I could keep going in that direction, but it is becoming clearer that some of the semantic differences between the simulator and Brainhat run pretty deep. For instance, the simulator has verb primitives for "get position" or "get orientation." In Brainhat-land, these are verb/object pairs.

Rather than make fundamental (and potentially large) changes in Brainhat to adopt the simulator's semantic representation, I think I am going to create a semantic matrix map for stuff coming into Brainhat and going out to the simulator. This is where the vocabulary mapping will occur too. For example:

Brainhat -> sim

command    subject    object    orientation    etc....    map index
---------------------------------------------------------------------
get        *          *         left                      [1]
get        *          *         *                         [2]
enable     *          light     *                         [3]
.... and so on...

Upon a match:
1) look-up semantic mapping via index,
2) resolve vocabulary components,
3) create EndeaVR record and send
4) harvest status

sim -> Brainhat

{O1},{R2},{O2},{R3},{O3},{A}
---------------------------------------------------------------------
{*},{*},{specific UUID},,                                 [a]
{*},{specific UUID},{*},,                                 [b]
.... and so on...

Upon a match:
1) look-up semantic mapping via index,
2) resolve vocabulary components,
3) create skeleton English representation
4) send to Brainhat for elucidation.

I don't think this will take too long, and it will allow me to be flexible in accommodating changes in the simulator as it develops.

-Kevin
Left at 1:30.
July 1, 2004
At VRSim, 9:00. I have a few things to do here and elsewhere today. My new office is ready for occupation. I need to go to the bank to straighten out Brainhat accounts. I think I am on the verge of being offered two jobs--one by the state, and one by Sonalysts. It will be difficult to decide what to do. I’d like to make a further go of Brainhat, but there are no paying customers to date....
Now, to work on the map...
(later that afternoon) ... I left at 2:00 and returned at 3:30. It is now 4:30. The mapping thing is coming along. I have symbol tables for going from UUID to English-like descriptions. Now I need to substitute the English-like stuff into Brainhat pidgin speak.
July 14, 2004
I am at VRSim for a little while. I finished the simulator -> brainhat mapping piece, but haven’t tested it yet. I’m hoping that Scott can get his code running before I have to leave at 11:30 to meet with Dan.
I have two job offers--one with the state and one with Sonalysts. The Sonalysts job would probably be more interesting, but perhaps more restrictive. It is in Waterford, which makes contact with the boys a little more of a challenge too. Both offers are verbal as of yet. The state offer is a little complicated by the fact that the governor quit and the lieutenant governor took over. She asked for resignations from a number of department heads including the head of the state’s IT services. That is delaying some of the process flow. What to do?
August 31, 2004
I am working on the interface to the VRSim code again. What an abortion.... Anyway, I can't see any way to ask if a sphere is orbiting. I can ask "what is a sphere?" Is that good for anything? Dunno.
September 8, 2004
I have completed a loop from Brainhat to simulator and back where I can say "create a sphere", see it on the screen, and get "simulator made sphere" back. The semantic mapping of the simulator response is being done via a map. The translation from the Brainhat robot interface representation isn't done yet--I hardwired it. Now, the question is: what should the interface look like?
start tag=000033753 command=to make object=ball stop
The fields are not fixed--they could be whatever I choose to create in robospeak and recognize in vrsimipc’s translation tables. I need a good structure to contain type/value pairs. It has to have wildcard capability. It has to have a facility to map into simulator speak.
So, here’s an inefficient idea:
I have rules that look a lot like the API syntax, except that they can have wildcards and they have an extra ’xlate’ field.
start command=to make object=* xlate={^object^}{create from}{vis node}.... (whatever) stop
Hmmm.... question is, how do I deal with classes of stuff? I don’t have any inheritance information available to me from within vrsimipc. I guess I could add a facility to make object (or whatever) be a list of alternatives. E.g.,
start command=to make object=(sphere,cone,block) xlate={^object^}{create from}{vis node}.... (whatever) stop
The structures I will need are:
struct rule {
    struct pair *pairs;   /* the chain of type/value pairs (and their alts) */
    char *xlate;          /* the substitution rule for a match */
    struct rule *next;    /* the next input rule */
};

struct pair {
    char *type;           /* command, object, disposition, etc... */
    struct alt *value;    /* list of alternatives for the given type */
    struct pair *next;    /* pointer to next pair */
};

struct alt {
    char *text;           /* the value of an alt--sphere, cone, etc... */
    struct alt *next;     /* pointer to next alt */
};
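To convince myself the structures are sufficient, here is a sketch of how matching might go over them, honoring the "*" wildcard and the alternative lists. find_xlate and the parallel type/value arrays are illustrative, not the actual vrsimipc code.

#include <string.h>

struct alt  { char *text; struct alt *next; };
struct pair { char *type; struct alt *value; struct pair *next; };
struct rule { struct pair *pairs; char *xlate; struct rule *next; };

/* Does one incoming type/value (say, object=sphere) satisfy a
 * rule's pair, given the wildcard and the alternatives? */
static int match_pair(struct pair *p, const char *type, const char *value)
{
    if (strcmp(p->type, type) != 0)
        return 0;
    for (struct alt *a = p->value; a; a = a->next)
        if (strcmp(a->text, "*") == 0 || strcmp(a->text, value) == 0)
            return 1;
    return 0;
}

/* Find the first rule whose every pair is satisfied by the input;
 * return its xlate string, or NULL if nothing matches. */
char *find_xlate(struct rule *rules, const char **types,
                 const char **values, int n)
{
    for (struct rule *r = rules; r; r = r->next) {
        struct pair *p;
        for (p = r->pairs; p; p = p->next) {
            int hit = 0;
            for (int i = 0; i < n && !hit; i++)
                hit = match_pair(p, types[i], values[i]);
            if (!hit)
                break;                      /* this rule misses */
        }
        if (!p)
            return r->xlate;                /* all pairs matched */
    }
    return NULL;
}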
February 2, 2005
Wow. That was really something. I’m glad that’s all over.... The last time I looked at the code was VRSim. So much shit has happened between then and now that I can’t explain it here. Besides, I am on a plane to Las Vegas with Christie and Bettina (ex at this point), and Christie is leaning her head on my shoulder, so anything I might say would get read....
As for the code... I traded IM with Richie the other day. He has a friend who was in a grad class at UConn. Brainhat came up. The teacher--a Schank proponent--knew of Brainhat, and had issues with it. Interesting that we've made some impact at least.
Anyway, I said to Rich that it’s probably time to get going again, and he agreed. We have done lots of amazing stuff--most of which is neither visible nor documented.
Anyway, I don’t even remember what I thought I needed to do next, though I know there was a lot to do. I expect that it will all be in the notes here somewhere... In a global way, I think it might make sense to start thinking about the project as a KR effort, or even a "conscience engine." The dialog space is muddied with bots. And bots actually even represent a good starting point for getting language-based information into the system. If you can’t beat ’em....
Where to start? I could start with memory management... That's a big job that needs doing. The goal might be to change the way that concepts and conchainnodes are allocated and destroyed so that the program takes less memory and runs faster. Currently, everything is malloc'd as the program runs, and retrieved at the end of a program cycle.
Character strings likewise need to be created and reaped. I was thinking that I could prepend the string length to each malloc’ed string so that I hmmmm..... Don’t care about the length actually.... I guess I could prepend a link count.
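The link-count idea, sketched; newstr/holdstr/dropstr are made-up names, and the header is just an int hidden in front of the text.

#include <stdlib.h>
#include <string.h>

#define HDR sizeof(int)

char *newstr(const char *s)                 /* allocate with count = 1 */
{
    char *raw = malloc(HDR + strlen(s) + 1);
    *(int *)raw = 1;
    strcpy(raw + HDR, s);
    return raw + HDR;                       /* caller sees the text */
}

void holdstr(char *s)                       /* one more reference */
{
    (*(int *)(s - HDR))++;
}

void dropstr(char *s)                       /* reap on last release */
{
    if (--*(int *)(s - HDR) == 0)
        free(s - HDR);
}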
February 4, 2005
On the way back from Vegas.
I decided that memory management can wait. I am tossing all of the nasty, complicated pattern matching out in favor of a simpler, streamlined set. The idea is that something like AIML could be used to preen the input into simple English. Then Brainhat can be an inference engine or dialog manager without too much complexity.
February 13, 2005
Tomorrow is Valentine’s day, and for the first time in probably 25 years, I’m not on the hook to come up with some crappy card!
I got frustrated with the time it was taking for Brainhat to start up. I did some profiling and discovered that there was a problem in pass1 and pass2 that made the program spend forever trying to parse nothing. I fixed it. The startup time dropped by a factor of 100 to about 2.5 seconds.
I’ve cleaned out the words file--it’s empty. Everything I need to run the tests is in required-patterns. Likewise, I revisited all of the tests.
Now, a slight digression: Rick and I are trying to buy back Atlantic. If we succeed, we are going to have to come up with a new NOC infrastructure. A lot has changed over the last few years in the security business. One thing that customers are looking for is event correlation. Think of that! My very first job was event correlation software for nuclear power plants!
At that time, we had a facility that we called the critical functions monitoring system or CFMS. It had "intelligence" cast into FORTRAN nested if-statements. What if we could explain the critical functions and all associated inferences in English? Cool.
August 20, 2005
Man, did we get fired!
Rich and I have been working on peoplething. It is time to come up with a conversational component. I had some crap hacked up, but decided I’d rather suffer the pain of using Brainhat. So, I’m here to take a look.
The dialog we need includes stuff like:
system: "where are you?"
user:   "at chilis", "at chili's", "chilis"
system: "the one near the mall?"
user:   "no, the other one", "no", "yes", etc...
And so on.... First, I have to depend on the system's ability to do wildcard inputs, like we did with Yogendra Jain. Second, I need an mproc to handle the case where someone mentions a 'thing' and the 'at' is implied. I guess that, part and parcel of that, I need to be able to look stuff up from the peoplething alias data table. Also, I have to be able to handle situations where I don't know what a place is...
September 7, 2005
I have given it more thought and had some discussion with Rich. I am going to use Brainhat as a one-way filter for the time-being.
media --> brainhat [mproc/robot] ---> rich's stuff
      <------- intermediate thing <-------
This will mean: 1) building a grammar for the things I expect Brainhat to be able to understand, including idiom processing up front, 2) building an mproc or using robospeak (?) to contact and forward info to rich's stuff, 3) possibly a mechanism for Brainhat to respond to basic questions in the event that Rich's software doesn't return anything, or in the event of an error. The portion labeled "intermediate thing" will be responsible for taking output from Rich's stuff and formatting it.
Some phrases:
I am at X
I am here
I'm here
At X
at x's
going home
done
busy with someone
tell X Y
I'm busy
go away
out
in
I'm out
fuck you
tell her i'm busy
ask her what she is wearing
what profile
which profile
describe her
where?
I'm in the bar
September 20, 2005
So, I now have some grammar and sloppy matches to handle some of the above. I am going to create an mproc to look for the appropriate CCs and return them on stdout or something as commands to pass to peoplething. If I return the results on the command line, I'll preface them with a #. Anything that comes back without a # will go back to the user.... Some of the pertinent data, e.g. the user id, will not have passed through brainhat, and will have to be reassembled into the command. More details when I try it once or twice...
September 20, 2005
Brainhat is no more than a filter in the grand scheme of things... (peoplething things). In mediahandler.c, I get something from the user and pass it through Brainhat to see if it is a declaration of location or whatever. I will dispatch the results as described above--to the user if not preceded with a #, and to actionUser (rich's interface) if a # is present.
The next issue has to do with what to do if actionUser returns looking for disambiguation from the user. For instance, let's say that the user says "fred's". Rich's database may come up with 0, 1 or many matches. If there are 0, then I need to tell the user that there's no Fred's (or ask something else like "where's that?"). If there is 1, then I will go back to Rich's code with a location.known. If there are many, then I need to go back with a list and (perhaps) ask the user to choose by number. The last one is problematic because I am now in charge of keeping state for the user. That is, I need to map the numeric response to a list that isn't anywhere anymore...
Ah, heck. If I come back with 'many,' I'll just make them enter the name over again to match one of the ones on the list. If I come back with zero, I'll ask about 'fred's'--"what is that near?"--and use the response as the start of the next location.told cycle.
The other place for state is when we are relaying messages. "There is a woman near you: bustyb" "Tell bustyb hello" "Tell her hello" or "Connect us"
March 19, 2006
I am working with Rich to resurrect the database version. We are trying to use pass1/pass2 to send the data directly to the database for Rich's consumption--in lieu of the java stuff we used 'til now. Brainhat will do a run-once to populate the symbol table. At the same time, it will pass to Rich the data from the input as name/value pairs.
The routines I will call look like:
bhatid = createconcept(char *name, char *collection);
updateconcept(int bhatid, char *name, char *value);

where name/value are the left and right columns. Also, somehow, I need to keep track of the 'collection'--words, rules, etc... Not sure how to do that.
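A sketch of how pass1/pass2 might drive those two calls; the argument types are guesses from the shapes above, and the word stored here is made up. One answer to the 'collection' question might be to pass the collection name down while each file (words, rules, etc.) is being parsed.

extern int  createconcept(const char *name, const char *collection);
extern void updateconcept(int bhatid, char *name, char *value);

void store_word_example(void)
{
    /* one row per name/value pair, all hanging off one bhatid */
    int bhatid = createconcept("ball-1", "words");
    updateconcept(bhatid, "label",    "ball");
    updateconcept(bhatid, "child-of", "toy-1");
    updateconcept(bhatid, "wants",    "color-1");
}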
Is this enough, me asks. Did we forget something?
March 17, 2007
I have wild ideas for Brainhat.
I’ve also distanced myself from Rich due to his continued friendship with my piece of shit cheating-ex girlfriend and her ugly fiance, so the database is out.
Briefly, I think it would be interesting to cut brainhat into pieces so that the part that runs inferences can run by itself inside broadcast domain devices that share things they learn about the world. Long ago, I changed the way Brainhat ran inferences from one of identifying antecedents of inference templates as CCs to muttering to itself in self-talk. This was kind of cool because the self-talk could create results that you might not get if you used CCs. Puns, for instance....
Anyway, for a portion of a brainhat to run efficiently inside of other devices, it’ll have to be less of a CPU and memory pig, so that rules out the language parsing bit. It may even rule out brainhat as written; I might need to do something new.
I also probably want to clean up Brainhat as is. One thing I started and never finished was documenting normal forms for CCs. That will be important. Also, since these ’agents’ would have context, I need to finish thinking about how they can update context, correct context and forget things.
I have to clean up the grammar too; making it understand English with all its idioms is too difficult from first principles. Even the human brain doesn't do it that way--doesn't take a statement at face value and then back-pedal to figure out what it really meant. It's a good job for AIML.
So much to do...
November 2, 2007
I was working on some other stuff that nobody cared about, so I decided to work on this thing that nobody cares about. Besides, I broke my leg.
Nov 3, 2007
I was ripping out all the ugly SQL stuff last night... I eventually broke Brainhat when I pulled the meme stuff. I decided to start over, but be a bit more gentle. Memes could exist without the database. So could meme maps and meme shifting. I wrote code to parse the maps at some point not so long ago. Anyway, I will proceed more gently.
Nov 7, 2007
There is no treatment of pronouns in uttered output to make Brainhat say 'him' in lieu of 'he', or 'her' in lieu of 'she'. So, we get output like "mario is near she." I think that if I indicate that I have passed the verb, then utter_pronoun can take the cue and issue the pronoun in the correct form. Experimenting....
wondering what I broke:
Brainhat, Copyright (C) 2002, Brainhat LLC, Version 2.051001.
>> mario sees the princess
mario sees the princess.
>> does mario see her?
maybe. I do not know.
>>
Nov 8, 2007
Regarding run-on prepositional chains, à la "mario near luigi near mario...": maybe the fix is that the objects of the preposition be prohibited from expressing their own prepositional attributes? Easiest way to do that is to create utter_thing_noprep or to modify utter_thing so it knows whether to voice the prepositional phrase.
That worked great. I created a second routine though.... that’s inefficient. Perhaps I should add a flags variable to the end of the "utter" routines to avoid duplicating code or reverting to globals.
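For the record, what the flags approach might have looked like; the names and the flag value are hypothetical.

#define UTTER_NOPREP 0x1

struct concept;
extern void utter_thing_core(struct concept *thing);  /* article, adjectives, noun */
extern void utter_prep_phrase(struct concept *thing);

void utter_thing(struct concept *thing, int flags)
{
    utter_thing_core(thing);
    if (!(flags & UTTER_NOPREP))
        utter_prep_phrase(thing);   /* would pass UTTER_NOPREP down to the
                                       object of the preposition to stop
                                       run-on chains */
}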
Used a global... sorry
Looking at test016.t. Tobe fails to come out in the right tense in some cases. The routine getbesttense in rels.c looks like the culprit. All I have time for now... sleepy.
Nov 10, 2007
Ugh.... so much computation... making debug difficult. So, I step back and ask myself: "Does there have to be some memory allocation and computation going on? Is something broken?" To answer, I am going to step through the processing of one question: "is the ball red?" With no other context, this causes 9 passes through utter_tobe. Why?
The question is fielded by question-decl-3.
/* is the ball round? */
define question-decl-3
    label question
    rule $c‘tobe‘0! $r‘csubobj-q‘1! [$c‘not‘3! ]$r‘ccattr‘2![?]
    map VERB,SUBJECT,AREQUIRES,WANTS
    mproc ADDTOQS
    mproc READDREQUIRES
    mproc SPEAK
    mproc YESNO
    mproc PAREQUIRES
    mproc PULLAVTNS
    mproc ORTHANSWER
    mproc CHOOSEONE
    mproc ORTHFAVR
    mproc PULLATTRS2
    mproc ATTRQUES2
    mproc QUOTQUES
    mproc GRAPH
    mproc TOBECOMP2
    mproc PUSHTENSE
    mproc PULLTENSE
    mproc HASHREQS
At hashreqs, I have two statements--one for a toy and the other for a party. I end up in hashreqs three times, though--before I even get to pulltense. Why?
>> is the ball red?
find: rule subobj1 matched: the ball red?
find: rule subobj-q-1 matched: the ball red?
Break in hashreqs at the start:
A conchain is available.
debug>
find: rule subobj-q-1 matched: the ball red?
Break in hashreqs at the start:
A conchain is available.
debug>
find: rule csubobj-q-2 matched: the ball red?
find: rule cattr2 matched: red?
find: rule ccattr2 matched: red?
find: rule question-decl-3 matched: is the ball red?
Break in hashreqs at the start:
A conchain is available.
It appears that we are getting matched by subobj-q-1 and csubobj-q-2. Okay, these are invoked by question-decl-3. But, why is hashreqs being called thrice? That is, why is hashreqs being called on the component matches?
debug> sdump subobj-q-1
define subobj-q-1 [16727]
    label subobj-q
    label subobj-q-1
    rule at 87b41c8
    map: 44 9 63 9 0 0 0 0 0 0
    mproc hashreqs
    mproc lowpronouns
    mproc qacandsall
    mproc orthelim
Interesting... the above shows subobj-q-1 as pulled from the symbol table. The following shows it as taken from input-patterns. The question is: where’d the call to hashreqs come from?
/* e.g. "the big dog" */
define subobj-q-1
    label subobj-q
    rule [$c‘article‘0! ]![$r‘cattr‘1! [$c‘adjective‘3! ]]$m‘things‘2
    map ARTICLE,AREQUIRES,ROOT,AREQUIRES
    mproc ORTHELIM
    mproc QACANDSALL
    mproc LOWPRONOUNS
    mproc HASHREQS
Oh my! There it is! But why?
Looking closer... it's all okay... hashreqs is used to fix the AREQUIRES tags. Moving on.... still trying to justify nine calls to utter_tobe.
Routine dospeak cycles through each of the ’utter’ routines, allowing them to return a MATCH/NOMATCH. This explains why utter_tobe could be called even though it produces no output. However, my 9-count comes from within utter_tobe, after the test for a match.
back to debug, then....
mario is red
mario was blue
is mario red?
result is coming back (test016): "yes. mario is red is blue."
Ugh... still a huge number of calls to utter_tobe. These are coming from calls to descon in tests.c. Maybe it is because of the journal?
That was it! Now it’s sane.... One call to utter_tobe per output. Of course the answer is still wrong, but at least I can see what’s going on through the noise...
"mario is red is blue"... The problem might be that the question is asked in the present tense. There’s good agreement between the tense of the question and the tense of the attribute "red." However, there’s a toss-up when it comes to "was blue" and "is red"; the tenses don’t match, though the number and person do. Accordingly, "was" (tobe-1) and "is" (tobe-2) score the same and the program just picks one. Question is... how to bias in favor of the tense of the attribute?
If I pass the attribute and the verb ’tobe’ down to getbesttense, without the root concept, I at least appear to get the past tense correctly, but the number and person are now missing because they aren’t tagged onto the attribute. What to do? I could clone and copy them. Or, perhaps getbesttense is used so narrowly that I could make the appropriate bias for the tense on the attribute within the code.
Hmmm... getbesttense is used in a lot of places.
I created a little mess to promote the TENSE from the ATTRIBUTE in utter_tobe_tense to a cloned copy of the concept. No ill effects yet.
Argh! What’s this? I can only set one location breakpoint on a routine? I tried break pause loc1 and break pause loc2. The second one (whichever it is) takes, and the first is ignored. Bit rot... Must fix.
Next project: test018 has weird output.
>> red mario is happy.
mario is glad.
>> blue mario is sad.
he is sad.
>> is red mario sad?
no. mario is sad is blue.
It should say "no. blue mario is sad"
For tomorrow or so...
November 15, 2007
I want to do better than this:
>> what color is the square block?
Break in speak at the start:
A conchain is available.
debug> dump
define Root-831d [999966434]
    label question-what-4
    quote Root-831d
    person third
    number singular
    tense present
    requires color-8370
    verb tobe-8079
    subject block-7f4c
debug> cont
output by: utter_tobe
the block is red is square.
I can see that the REQUIRES tag is looking for color. I should pull any attribute that is not a child of color and mutter it as part of the subject. That way, this would say "the square block is red."
I could also contemplate conjunctions and commas in the output.
November 21, 2007
Fixed some of utter_tobe. There’s an interesting issue with the following two statements: I don’t think that they are ending up in the same form, but I have to get to work...
"the block is red"
"the block is round"
versus
"the red block is round"
(later)
This is from the former:
define block-7f49 [6604]
    label block
    label block-1
    attribute round-800d
    attribute red-7f47
    article the-7f4a
    typically square-1
    wants size-1
    wants color-1
    child-of toy-1
This is from the latter:
define block-7f4c [6604]
    label block
    label block-1
    attribute red-7f4a
    attribute round-7f4d
    article the-7f4e
    typically square-1
    wants size-1
    wants color-1
    child-of toy-1

Looks the same. In utter_tobe, I get the following form of block as the subject tied to the root in the first case:

define block-7f49 [6604]
    label block
    label block-1
    child-of toy-1
    wants color-1
    wants size-1
    typically square-1
    article the-7f4a
    attribute red-7f47
    attribute round-800d
    requires color-82de
    requires round-800d

I get this pattern for the second case:

define block-7f49 [6604]
    label block
    label block-1
    child-of toy-1
    wants color-1
    wants size-1
    typically square-1
    article the-7f4a
    attribute red-7f47
    attribute round-800d
    requires round-800d
    requires color-8166
The difference appears to be in the attributes assigned to the block. In the first case, the attributes "red" and "round" both appear with a TENSE tag that says "present." In the second case, only the color "red" appears with a TENSE tag. So this is a problem with an input pattern. Searching now...
The pattern is sent-declare-attr2 in either case... looking deeper.... csubobj4.... cattr2. There’s no place where the tense is being attached to the attribute associated with the subject, but appearing before the verb. Should there be? What would break?
It looks like routine pushtense is supposed to do just this, and it is an mproc for sent-declare-attr2. I need to see what’s broken here...
November 23, 2007
pushtense is working okay, as written. The issue is that this is the CC that gets generated for "the red block is round":
debug> break pushtense all
debug> cont
>> the red block is round
Break in pushtense at the start:
A conchain is available.
debug> dump
define Root-7f35 [999967434]
    label sent-declare-attr2
    person third
    number singular
    tense present
    attribute round-1
    verb tobe-2
    subject block-1
The other ATTRIBUTE, "red", is glued to the SUBJECT, but pushtense has no way of reaching it. Furthermore, should it? Have to take a shower and think about it... The ATTRIBUTE "red" is glued onto "block" without any TENSE. Does that make it open season to push the present tense onto it?
Ugh.... Ian burned a bag of popcorn--my recipe. The house really stinks. Ian and Ross were here long enough to play video games and watch a movie.
I am going to try the above. If there’s a tense already, I won’t create a new one, of course.
Okay... that works. Now, test19 looks a lot the same. But it answers stupidly too:
>> red mario is in the water
red mario is in the water.
>> where is red mario
he is in the water is red.
>> where is red mario
mario in the water is red.
Digging...
The generated CC looks right. It must have to do with the output routine, utter_tobe.
Here’s another:
>> a man talks with the ball
a man talks the ball with something.
Fixed. There’s some anxious interplay between sent-action-4 and sent-action-4a.
I just 'fixed' something else, but I don't like the fix and it's not making things right. Take a statement like "mario tells luigi that the princess loves the block." Luigi is the indirect object. But what is an indirect object? It's the same as a prepositional phrase used as an adverb. Read: "mario tells to luigi that the princess loves the block." Is there any use for indirect objects? I have to look around and see where I've used them. Well... they're used in several places... mostly imperatives. I could write a little mproc to turn them into prepositional adverbs and be done with indirect objects forever.
December 2, 2007
I will get back to the indirect objects... but, let me see if I can organize my thoughts about future directions for Brainhat.
• Identify the four cases of normal forms for all possible inputs
• Debug and streamline the existing grammar
• Revisit memory management
• Implement some kind of tagging of statements in the context so that the context can be revised as new facts supplant old ones, or old ones are found to be untrue.
• Implement some kind of temporal tagging so that ideas can age out.
• Revisit memes revived from XML
• Make the user interface and debug easier
• Hone the robot interface
• Look into an AIML-like front-end for idiomatic expression
• Resurrect the Brainhat-to-Brainhat communication capability and add broadcast, unknown concept queries
December 4, 2007
Screwing up sent-action-10. With or without the indirect object, the pattern has to be very specific with respect to the indirect object. The ’to’ of the indirect object has to be present or a statement like: "mario sees the ball" fails as the pattern tries to consume "the ball" as the indirect object, and then runs out of stuff....
One (ugly) fix is to precede the pattern with one that requires an indirect object. That may be today’s fix....
December 24, 2007
The block is red. Why are there so many ’dirty’ concepts?
define Root-7fb9 [999967341] label sent-declare-attr2 child-of things auxtag no-object-context hashval 401 person third number singular tense present attribute red-7fba verb tobe-7fbb subject block-7fbc
Above, 'tobe-7fbb' needn't be dirty. Likewise, the article 'the-1' for the ball (not shown) is dirty. (nevermind....) Upon first use, each concept is cloned into the context whether modified or not. This is kind of wasteful of space, though...
Been playing with state saving too. Assume that it is all screwed up and changed from what it used to be. I will make it work in some fashion.
December 25, 2007
Merry Christmas!
I have been working on state.c, trying to make as lean a state file as possible. The old one contained full sentences. I want one that needs no natural language parsing, and is small. This is what I came up with:
[context] 0 142531880 ’define Root-8044 [dirty]’ 3 1040 ’child-of things’ 12 2813 ’subject princess-8045’ 16 9101 ’verb to-8047’ 14 6558 ’object ball-804c’ 33 1833 ’tense present’ 42 1896 ’person third’ 2813 142527240 ’define princess-8045 [dirty]’ 3 1944 ’child-of female-1’ 3 1111 ’child-of woman-1’ 2482 142482656 ’define the-8046 [dirty]’ 3 1800 ’child-of definite’...
This is really test output, actually. Ideally, all that would be required is a string of ints. One additional requirement though.... the tag numbers for all the concepts, e.g. 'ball' or 'princess', have to agree. I suppose that if one brainhat handed another brainhat a reference to a concept it didn't know about, it could look it up. I have already considered this possibility in my minibrainhat notes.
Was looking for a problem with the context saving code. It wouldn't save the 'speaker says' bits. I found that spkrsays didn't set the 'rw' bit for the new concept that says "speaker says...." So I set it. Looks okay now. Don't know about side-effects.
Hmmm... different issue now... I say "if a man sees the princess then a man loves the princess". The top-level (CONDITION/CONSEQUENCE) doesn’t get put into the saved context (presumably the rw flag is false), but the components of it do... I think I need to look for AUXTAG no-object-context when I save these bits.... and maybe pass it on? Bigger question: am I writing to save context? Or am I writing to share info with other brainhats?
A couple of beers later... I am really writing for two different reasons: one is to save context, completely, so that I can use one instance of Brainhat to compile for others and so that I can compile for smaller, more efficient distributed inference engines.
The other reason for the efficient compilation is so that I have a prototype mechanism and a base-capability to feed information to other brainhats or to inference engines.
The requirements are different. In the first case, I need to save all of the context, including inference templates. In the second, I think I want to pass along context to the exclusion of inference templates. Consider that inference templates are programs, and distributing programs as data might be a bad idea for security reasons (but it sure is sexy to be able to do so, huh?).
December 26, 2007
Hmmmm... I’m making a mess. I need to define the minimum possible, and I want to do it in pair-wise tuples. Try this:
0 tag

    means define some new concept using the sender's 'tag' as a way to identify it as new (in other words, dirty), but not expecting the tag to be any more than a unique identifier to the receiving copy of brainhat--probably derived from the memory address on the sender's machine. The receiving brainhat shouldn't use the tag; it might conflict with an existing concept.

linktype tag

    indicates a link to a concept with the supplied tag. It could be a dirty/new concept, or it could be something from the basic pool.
In the definitions (the first case), the real tag number has to be present in the definition so that the receiving copy can identify the concept as an instance of something it knows, if appropriate.
I believe that I only want to define/link to new concepts when they are ’dirty.’ The ’just cloned’ concepts are as good as new concepts from the point of view of the receiving brainhat.
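To make the format concrete, here's a minimal sketch of the tuple stream in C. The struct, the constant and the routines are placeholders of mine--stand-ins for whatever state.c ends up doing--not the real code.

    #include <stdio.h>

    /* Pair-wise tuples: linktype 0 means "define a new (dirty) concept";
     * any other linktype is a link to the concept with the given tag. */
    #define T_DEFINE 0

    struct tuple {
        int linktype;
        int tag;        /* sender-unique for defines */
    };

    static void put_tuple(FILE *out, int linktype, int tag)
    {
        struct tuple t = { linktype, tag };
        fwrite(&t, sizeof t, 1, out);
    }

    static void read_stream(FILE *in)
    {
        struct tuple t;
        while (fread(&t, sizeof t, 1, in) == 1) {
            if (t.linktype == T_DEFINE) {
                /* allocate a fresh local concept; remember that the
                 * sender's t.tag now maps to it */
            } else {
                /* attach a link of type t.linktype to the concept the
                 * sender called t.tag (dirty, or from the basic pool) */
            }
        }
    }

The receiver keeps its own map from sender tags to local concepts, which is exactly the "don't reuse the sender's tag" rule above.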
Here’s the coolest thing... once this info is sent, there needn’t be any language on the far side; we are computing with knowledge without reguard for language!
December 28, 2007
Here’s "man sees woman."
define Root-7fe5 [999967278] label sent-action-10 child-of things auxtag no-object-context hashval 247 subject man-7fe6 verb tosee-7fe7 object woman-7fea tense present number singular person third define no-object-context [2622] label no-object-context define man-7fe6 [1087] label man label man-1 related woman-1 person third child-of men-1 define third [1896] label third child-of person define tosee-7fe7 [6908] label to see label tosee-1 subform saw-1 subform sees-1 subform see-2 subform see-1 child-of sense-1 define woman-7fea [1111] label woman label lady label woman-1 related man-1 person third child-of women-1 define third [1896] label third child-of person define present [1833] label present child-of tense define singular [1909] label singular child-of number define third [1896] label third child-of person
How should this look s.t. it can be saved or shared? Does my plan above make sense regarding tags?
define Root-7fe5 [999967278] label sent-action-10 child-of things auxtag no-object-context hashval 247 subject man-7fe6 verb tosee-7fe7 object woman-7fea tense present number singular person third

0  999967278  'define Root'
19 999967278  'tag 999967278'          (unique local tag to be asgn'd by rcvr)
3  2093       'child-of things'
12 999538434  'subject <something1>'   (ref tags made up by sender....
16 999837420  'verb <something2>'       ...rcvr will find 'tag x'
14 999637344  'object <something3>'     inside defn to identify these.
33 2045       'tense present'
38 1022       'number singular'
42 564        'person third'
0  999538434  'define man-1'           dirty definition
19 2031       'tag 2031'               tag number for man
3  1002       'child-of men'
etc...
January 17, 2008
I think I have finished with state saving stuff. Now, I will try to restore into Brainhat.
January 28, 2008
It almost works. Something troubling though... I don’t understand it. It looks like I have copies of stuff that’s supposed to be unique. Here’s a block. I added something to conprt to show the memory location:
define block-7fbc [6604] (142577456) label block label block-1 attribute round-8080 attribute red-7fba article the-7fbd typically square-1 wants size-1 wants color-1 child-of toy-1

define block-7fbc [6604] (142548792) label block label block-1 attribute round-8080 attribute red-7fba article the-7fbd typically square-1 wants size-1 wants color-1 child-of toy-1
You can see that these identical blocks come from different memory locations which can only mean that there are two copies. This is troubling; should be just one...
December 18, 2008
Well, did I say I’d get right back to you or not?
I made arrangements with Dan Collins to fold the corporation. We voted Rich Foster out, and compensated him duly ($1). Brainhat, Inc. will fold. I will take the IP, and agreed to give Dan some net profit (should there ever be any) in return for his investment in the project.
Regarding the issue above, the code following these comments in find.c explains it.
/* Append concepts to the root concept. Unfortunately, this code only lets me stack them two-deep. */
I am cloning everything! And, thinking about it, it makes sense. I would otherwise be modifying stuff that lives in the context, even if I was simply making an evaluation copy.
In addtocontext, I look for duplicate CCs. I need to be running behind (into the past) in the context and replacing individual concepts with more endowed versions of themselves, as long as they don’t cause Orthogonality violations.
But, I do think I've thought this through in the past, and I think there was a ponder routine whose job it was to go fix up references so more embellished concepts would replace their poorer cousins in the context. I am going to take a look at the ponder routines now. Be right back...
Found it! The routine is called updatectxt. I wrote it nine years ago or so. Now I have to see why it’s not doing what I need it to do.
Here’s a clue! I removed updatectxt from processing six years ago, as demonstrated by a comment and code in doponder. Going to go look for the reason....
The notes from July 1, 2002 explain the issue. Basically, updatectxt was being too simplistic. The recipe for the fix is in the July 1, 2002 notes. I’m diving in.
Double-hmmm... updatectxt does appear to be getting called....
More digging and a little debug... It appears that updatectxt is doing what it’s supposed to be doing: it is updating the links of old concepts. The problem is... it should probably be swapping the new concept for the old one and leaving the old one to be deleted (somehow). I can’t imagine that I didn’t try this before, but the issue would have been different. Now, I am trying to make a super-compact knowledge representation. Before, I might have been interested solely in a coherent, up-to-date context--from a knowledge point of view.
Must think about this... and poke at some code. Addtoctxt looks for children of the concept under scrutiny back in the context and replaces the links, as long as there’s no failure of orthogonality. This might be wrong... I can’t imagine why a use of "toy" followed by "block (red)" might not end up with a bunged up context... (testing that now)....
Hmmm... seemed to be right, ala:
the ball is red. the toy is in the water. is the ball in the water? (yes)
I could swap concepts in updatectxt, but that would be the wrong thing to do unless they were the same concept. That is unless they had the same tag number and were orthogonal.
So, I am going to add a little code to updatectxt to swap out concepts if they have the same tag number and pass the orthogonality test. If they don’t have the same tag number, then parent/child updates by virtue of sharing links (as currently) will be the action taken.
Thinking about it some more... I need to be mindful of TNP and other crap too, probably.
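A sketch of the planned change, with placeholder names for all the helpers; only the shape of the decision is meant seriously here:

    /* Sketch of the new updatectxt behavior; helper names are invented. */
    struct con;
    extern int tagof(struct con *c);
    extern int orthogonality_ok(struct con *older, struct con *newer);
    extern int tnp_match(struct con *older, struct con *newer);
    extern void swapcon(struct con *older, struct con *newer);    /* replace */
    extern void sharelinks(struct con *older, struct con *newer); /* as now  */

    void update_one(struct con *older, struct con *newer)
    {
        if (tagof(older) == tagof(newer) &&
            orthogonality_ok(older, newer) &&
            tnp_match(older, newer)) {
            swapcon(older, newer);     /* old copy left to be deleted (somehow) */
        } else {
            sharelinks(older, newer);  /* parent/child link updates, as before */
        }
    }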
Airplane is bouncy. I am on my way to Chicago, trying to beat the snow, to do a wireless installation on Saturday in Wisconsin. It is Thursday. I had a guy lined up to do the job, but he bailed out on me.
December 21, 2008
Gar! That was a long trip. It’s been snowing everywhere. Making changes to updatectxt.
It certainly improved things for the test case.... I don’t know what it broke. But the fix isn’t absolute; in a case dealing with red blocks and princesses, I still see clones of the block where there needn’t be any. Perhaps I’m not going deeply enough into the concepts on the context conchain. Particularly, it might have to do with ’speaker says’ concepts. Need to investigate. Have to shut off computer. Plane landing.
December 29, 2008
I removed all the ’speaker says’ constructs from the saved state. It looks pretty efficient. Amusingly, though, "the ball is red. the princess is happy." results in a binary save file of 368 bytes.... which is more than the text (sigh).
Where was I? I think I was working on making little Brainhats to communicate with each other and attain some level of gestalt. An amusing thought:
• If a brainhat learns something from another brainhat that it already knows, then it can forget it. That's because at least one brainhat has the knowledge, so why duplicate it?
The next step in my experiment is to create (or rediscover) an interface through which one brainhat can broadcast to another... and all others. Essentially, I'll be waiting to restore the context of whomever cares to share theirs. Shall I use the leftover meme tagging so that new bits of context get their own meme and get conditionally intertwined with the saved (meme 0) context? What would a meme map be for something I wasn't prepared to receive....? Hmmm... Interesting and fun problems.
I am going to work on the interface for receiving broadcasts.
I will assume for the time-being that transmissions of broadcasts take place only at start-up, and when an inference generates some new knowledge.
(later) There’s some code that Rich wrote that doesn’t do anything. It looks like it is supposed to allow other copies of Brainhat to send each other messages. I’m going to rip it out. The robot interface is still there if anyone wants to send text.
January 4, 2009
I am going to make the routines savestatetxt and restorestatetxt sub to another routine that can send and take the binary data from anywhere. These can then be linked to either of files or a broadcast domain of some sort. Routine savestatetxt already calls a routine rdumpcon2 that saves each of the CCs in the context.
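One way to shape that, sketched here with invented names: hand savestatebin/restorestatebin a pair of read/write callbacks, so the same walk over the context (the rdumpcon2 part) can feed a file, a socket, or a broadcast domain without knowing which.

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical I/O vector; not the actual Brainhat interface. */
    typedef size_t (*state_put_fn)(const void *buf, size_t len, void *arg);
    typedef size_t (*state_get_fn)(void *buf, size_t len, void *arg);

    struct state_io {
        state_put_fn put;
        state_get_fn get;
        void        *arg;          /* FILE *, socket wrapper, etc. */
    };

    /* File-backed adapter; a broadcast adapter would wrap the socket. */
    static size_t file_put(const void *buf, size_t len, void *arg)
    {
        return fwrite(buf, 1, len, (FILE *)arg);
    }

    /* savestatebin would push each CC's chunk through io->put;
     * restorestatebin pulls through io->get. */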
Creating a check point here....
February 10, 2009
Like a lot of other little projects, the one above was advanced, but I didn't finish it. I plan to come back to it later. In the interim, I have been looking at rewriting the grammar, adding arithmetic functions and tailoring the whole mess for robotics. It would be both a straightforward and a large undertaking at the same time.
In a sense, I’d be reducing the unbounded grammar of human language to a smaller grammar for programming. This would help make Brainhat both more useful and more deterministic. There’s a lot more to the project. At the moment, I’m making a plan for tossing out and rebuilding the grammar with essential elements only.
Grammar (for a start):
A verb B, where A, B are children of things, possibly subjunctive clauses, and verb is a transitive verb in one of a number of tenses, including present, preterit, future perfect, future imperfect and possibly subjunctive.
A verb, where the verb is intransitive, null element implied.
A tobe C, where A is a child of things and C is an attribute, including possibly prepositional phrases or numerical/logical expressions (!).
A tobe B, where A, B are children of things, possibly subjunctive clauses.
Then negation in each of the above. When I get past that, I’ll re-add conditionals, imperatives and questions.
Other things to do: For each CC, I need to keep a list of every condition that led to its existence, so that if the condition is rescinded or aged, the consequence can suffer the same fate. I need to add primitives for manually aging CCs into the past tense as well. This could also allow for inference templates to be reset.
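A hedged sketch of the first of those to-dos--per-CC condition tracking--with types and names invented here: each CC carries a list of the CCs it justified, so rescinding or aging a condition can cascade to its consequences.

    /* Hypothetical dependency bookkeeping for CCs. */
    struct cc;
    struct cclist { struct cc *c; struct cclist *next; };

    struct cc {
        /* ...existing CC fields... */
        struct cclist *because;       /* conditions that produced this CC */
        struct cclist *consequences;  /* CCs this one helped produce      */
        int rescinded;
    };

    void rescind(struct cc *c)
    {
        if (c->rescinded)             /* guard against cycles/re-visits */
            return;
        c->rescinded = 1;             /* or: age it into the past tense */
        for (struct cclist *l = c->consequences; l; l = l->next)
            rescind(l->c);            /* consequence suffers the same fate */
    }

The same backlinks would let an inference template be reset when everything it produced has been retracted.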
"The mind of your robot"
February 14, 2009
It is Valentine’s Day--the worst day of the year. In 2005 I wasn’t anyone’s property, so I celebrated VDay with a scraggly beard. I think I got pickled too, with Carla perhaps. Though, she probably saw things differently than I. Today’s enslaver is Mary Kay. She explodes at the drop of a hat on a normal day. Today, I feel like I might just as well push all of her buttons at the outset and get it over with. Flowers, chocolate, dinner, an insincere expression of love... and then the mention of a previous lover or someone who I think has a crush on me. Good plan... I’ll have the rest of the night off.
February 19, 2009
I am in Puerto Escondido, Oaxaca, looking out the window. I came here on the recommendation of Meg's brother. The place I am in is a little bohemian--it's a hotel/youth hostel. There was street noise til about 2 am. I'm okay with all that except that 1) I am old for the crowd in this hotel and 2) I need to keep working. There's wifi in this hotel, but it doesn't reach my room very well...
Ostensibly, I am stuck in Mexico for a week. I need to find a reliable connection and a rhythm. I could leave here and go back up-coast too; there are other places to stay.
I went out and got a cup of coffee from a local restaurant, looked around the hotel, bought some water. I think I am feeling a little better about things.... Just needed to feel like I would be able to get some work done! All that's left to do is figure out how to afford retirement and college educations for the boys....
February 21, 2009
I have been here for three days. My computer has been locking up. Gar.
So anyway, I need to ask myself what I would do with Brainhat. I’m not making much progress, that’s for sure.
February 23, 2009
Main Street in Puerto Escondido is really pumping. It’s 11:30. There was a parade; people were out. I’ve been drinking too much beer, and am not feeling terribly healthy.
Thinking about a separate grammar for Brainhat. I could leave the old one alone, but perhaps set it aside for the moment. This other grammar would accomplish the same basic tasks, but be more precise and have some procedural capabilities. The old grammar would be able to run at the same time. But, I will set it aside for the time being. Maybe. Gar.
Main street is really pumping... I have had enough of Mexico for the time being.
Definition statements:
define xxx-1 {
    child-of zzz;
    label xxx;
    (etc)
};
This should look familiar. It’s the grammar for vocabulary concept definition. It should be easy enough to make this active at runtime. The difference is that I can’t do two passes over the definitions at run time; everything referenced has to exist already.
Instance declarations:
new xxx-1 {
    label freddy;
    (plus other directives from below);
};
Creates a dirty copy of xxx-1 and gives it the label ’freddy.’ All the other parts of xxx-1 are inherited, including the other labels.
Make a proposition
zzz-1 freddy {
    child-of yyy-1;
    is attribute;
    was attribute;
    mightbe attribute;
    isnot attribute;
    wasnot attribute;
    mightbenot attribute;
    verb frankie;      //transient
    verb null;         //non-transient
};
The proposition is given the label zzz-1. The verbs are in either the past, present or imperfect tenses. The imperative is treated below. The creation of the proposition does not cause its execution, unless the exec statement is included therein. Execution is caused by the exec command. A proposition can be tested for truth value. It need never be executed.
Evaluation
eval qqq-1 freddy;
eval qqq-1 (freddy);
eval qqq-1 (freddy && frankie);
eval qqq-1 !freddy;
eval qqq-1 !(freddy || frankie) && ernie;
eval qqq-1 rrr-1 && freddy;
eval (rrr-1 && freddy);
qqq-1 takes a truth value. If no label is specified, the truth value is generated, but not stored anywhere. This can be used inside conditionals. The truth value of the evaluation is created at the time it is executed. A fixed value for an expression can be had by not reevaluating it.
Execute a proposition
exec zzz-1;
When a proposition is executed, it has side-effects: the result gets hashed and becomes part of the context, and may cause waiting inference templates to fire.
Make an inference template
inference {
    condition {
        eval (qqq-1) && freddy;
    };
    consequence {
        exec zzz-1;
        exec xxx-2;
    };
};
Examples
I am going to try a few examples to convince myself that this is going to be possible and elegant.
define mario {
    child-of man;
};
define princess {
    child-of woman;
};

new mario {
    label mario-1;
};
new princess {
    label princess-1;
};

condition-see man {
    see woman;
};

consequence-like man {
    like woman;
}

inference {
    condition {
        condition-see;
    };
    consequence {
        exec consequence-like;
    };
};

mario-sees-princess mario-1 {
    see princess-1;
};

exec mario-sees-princess;
February 24, 2009
Maybe that’s a bit much.
I am still in Puerto Escondido. Tomorrow is Ash Wednesday. Tonight, the same parade that rolled through town on Sunday rolled through again--young girls on decorated trucks, handing out beer and candy. Rrr. I leave in the morning.
Another iteration: definition and new as above...
Make a proposition
prop zzz-1 {
    mario child-of yyy-1;
    mario is attribute;
    mario was attribute;
    mario mightbe attribute;
    mario isnot attribute;
    mario wasnot attribute;
    mario mightbenot attribute;
    mario verb frankie;      //transient
    mario verb null;         //non-transient
} [exec];
Make a proposition with variables
prop zzz-1 {
    mario child-of $1;
    $1 is $2;
    mario was $2;
    etc
} [exec];
Execute a proposition
exec zzz-1;
exec zzz-1(mario-1);
exec zzz-1($2);
Make an inference template
inference {
    condition {
        qqq-1 && (freddy || !frankie);
    };
    consequence {
        zzz-1();
        xxx-2();
    };
};
Hmmm... The variable substitution is very interesting. I will also need an indirect memory reference capability. This could get to be fun!
February 25, 2009
I am on the way back from Mexico, on a small plane. There’s a kid one row back with a video player. It sounds like he’s watching breaking glass and crushing metal.
The main datatype in Brainhat is the concept. That means that everything is a potential for variable substitution. The trick is to use it, and to figure out what to do when it doesn't make sense.
Let’s make another example, including the new tweaks to the grammar:
define mario {
    child-of man;
};
define princess {
    child-of woman;
};

new mario {
    label mario-1;
};
new princess {
    label princess-1;
};

prop condition-see {
    man see woman;
};

inference {
    condition {
        condition-see;
    };
    consequence {
        man like woman;
    };
};

prop mario-sees-princess {
    mario-1 see princess-1;
};

exec mario-sees-princess;
How do we do something useful with the outside world?
//
// Get a beverage
//
meme get_beverage {
    attract {
        person want drink;
        (person have thirst) && !(person has drink);
    };
    inf {
        cond {
            ...something...
February 26, 2009
Still being schizophrenic about this... spent the time on the plane from Houston trying to imagine programming scenarios that weren’t ultimately dialog based.
// This is the list of choices that will satisfy this meme,
// used below:
$choices = {coke-1 beer-1 milk-1};

// make it possible to get attracted to the meme:
activate get_beverage($choices);

// Choose a beverage: This meme tries to elicit a beverage choice
// from the speaker.
meme get_beverage($choices) {

    // These are the propositions that will attract attention to this meme:
    attract {
        speaker wants beverage;
        (speaker has thirst) && !(speaker has beverage);
    }

    // The goal is to decide what kind of beverage the user is looking for.
    goal {
        speaker wants oneof($choices) ||
          !(speaker wants a beverage);
    }

    // The background facts
    you have allof($choices);
    beer is cold;

    // Ask the speaker what they want to drink.
    ask speaker what beverage does speaker want;

    if
        speaker asks what do you have
    then
        tell speaker you have allof($choices);
March 9, 2009
$drink_choices = [coke-1 beer-1 wine-1 milk-1];

meme get_beverage {

    // Attractions
    speaker might want a beverage;
    if speaker is thirsty then speaker wants a beverage;

    // The background facts
    you have $drink_choices;
    beer is cold;
    milk is sour;

    // Ask the speaker what they want to drink.
    if speaker wants a beverage
    then tell speaker that you have $drink_choices and
         ask speaker what beverage does speaker want.
}

// Alcoholic beverages
meme drink_alcohol {
    speaker might want [beer-1 wine-1];
    if speaker wants beer...
April 16, 2009
Save me Brainhat. Business has turned down sharply, my ex-wife’s alimony has bled me and I only ever have about a month’s worth of funds before I turf out. Rrrr. Wanna be my girlfriend?
Here’s a whole meme: Beer is cold.
Here’s another:
If speaker is thirsty then speaker might want beer.
Now, imagine 100s or thousands of these listening to chatter and responding if they have anything at all to contribute. "Speaker is thirsty..." could produce "the speaker might want beer", "the speaker might want cola", "the speaker might want milk"
There could be other little memes that say: "if the speaker might want something then ask if the speaker wants something."
Then, they’ll hear:
"do you want milk?"
"no"
"do you want cola?"
"no. do you have beer?"
(speaker asks a question... clean out the queue)
"yes. i have beer."
"is it cold?"
Imperatives regarding speech, pronoun disambiguation, etc, have to be handled by a copy of Brainhat that keeps context but doesn't answer questions or run inferences. This would be a speech front end, of which there could be many.
If the speaker says that they want beer, then beer is going to the head of the queue next time... or statistically gets favored in some way.
My experiment:
Create x little brainhats with the following memes:
if speaker is thirsty then speaker might want beer
if speaker is thirsty then speaker might want cola
if speaker is thirsty then speaker might want milk
Plus, a brainhat that can talk.
Tell the brainhat that can talk that I am thirsty, let it pitch it out there and then see what the other brainhats do. It’s an academic test, but it’ll be interesting.
April 17, 2009
First things first: I need to change the existing brainhat code so that it can evaluate inferences without uttering them into English and then reparsing them. Ages ago, I used to evaluate the inferences as CCs. Then, I uttered the conditions in English and had Brainhat parse and evaluate them internally. I called it self-talk. The value of doing it this way was that I could take advantage of nuances in the interpretation of the uttered conditions--something might be taken to mean something slightly different than intended, and that might lead to interesting results. But, it sucks up memory and time. And, most importantly, I am planning to make copies of brainhat that don't have any words associated with them. It would be very difficult under those conditions to interpret uttered conditions....
Some of declques2--the last part--can be deleted. There is code in there that looks for inferences to run in order to answer the question. The code is protected by a debug variable that is never used. I don’t recall the last time I used it. The current assumption is that the necessary inferences will run at the time they get into the context buffer, and that there’s no need to run them when a question is asked. Routine declques2 evaluates the CCs and doesn’t depend on parsing.
The routine attrques2 is skipped entirely by the setting (or non-setting, really) of QINFERENCES. The assumption, as with the last part of declques2, is that the answer to the question will already be in the context, and that running an inference at the time a question is asked would be redundant and unnecessary. But.... I forget where an attribute-related question (e.g. is the ball red) gets answered.... Looking.
Oh yeah: the question gets cast into a statement where the attributes are replaced with requirements, hashed and compared against things in the context. It's buried in the parsing. This represents the challenge: I need to be able to turn any CC that asks a question about some attribute (e.g. is the ball red?) into a CC with the right combination of REQUIRES tags to generate the right hash and score appropriately. As long as the questions (and the answers) are stored in a predictable, normal form, this shouldn't be too, too terrible. I hope. Anyway, I need some new code or to look under the couch for attrques or something else...
Once I have a working attrques to go with declques2, I can use these inside tests.c to evaluate inferences. So, the first order of duty is to look into creating a routine (attrques3(?)) that will evaluate questions about attributes as CCs.
April 21, 2009
This is what the condition of "if the ball is red then I am happy" looks like:
define Root-7fac [999967315] (142562072) label sent-declare-attr2 auxtag no-object-context person third number singular tense present attribute red-1 verb tobe subject ball-1
And here is the input pattern that answers "is the ball red?". Recall that once there is a match, we start with the bottom routine and work our way up.
define question-decl-3
        label   question
        rule    $c‘tobe‘0! $r‘csubobj-q‘1! [$c‘not‘3! ]$r‘ccattr‘2![?]
        map     VERB,SUBJECT,AREQUIRES,WANTS
        mproc   ADDTOQS
        mproc   READDREQUIRES
        mproc   SPEAK
        mproc   YESNO
        mproc   PAREQUIRES
        mproc   PULLAVTNS
        mproc   ORTHANSWER
        mproc   CHOOSEONE
        mproc   ORTHFAVR
        mproc   PULLATTRS2
        mproc   ATTRQUES2
        mproc   QUOTQUES
        mproc   GRAPH
        mproc   TOBECOMP2
        mproc   PUSHTENSE
        mproc   PULLTENSE
        mproc   HASHREQS
hashreqs changes the xREQUIRES tags to REQUIRES tags and pushes the xREQUIRES down onto the associated concepts so that they can be preserved. The xREQUIRES tags are needed to create a hash value for looking up a concept(s) that may be the answer(s) to the question.
pulltense pulls a copy of the verb tense into the root.
pushtense pushes copies of the verb tense onto the attributes so that "was" (for instance) gets associated with the attribute. E.g., the ball was red makes a past tense copy of the attribute.
tobecomp2 hangs the attributes and any of its related tags onto the subject so, for instance, the ball now becomes a red ball. In other words, the question gets converted to "is the red ball red?" Or, perhaps, "is the red ball blue?"
quotques makes a copy of the question within the question.
attrques2 does nothing anymore.
pullattrs2 expands the number of CCs in the conchain so that each has one ATTRIBUTE attached. So, if by the time we reach here the question has become "is the red, round ball red?", the conchain will end up with two CCs: "is the red ball red?" and "is the round ball red?"
orthfavr sweetens CCs that will produce "no!" answers, but not to the level of something that will produce a "yes" answer.
chooseone throws out all the CCs except for one--the one that scores the best.
orthanswer adds a "not" to the top of the CC if the winning answer contains an orthogonality. E.g. "is the red ball blue." Voiced as an answer, "the red ball is not blue."
pullavtns (re)pulls a copy of the tense of the verb into the root. It pulls the tense of the ATTRIBUTE, if it exists, into the root as well, but attaches it to a WANTS tag. This way, Brainhat can distinguish between tenses in the answer, e.g. "was the (is) red blue?"
parequires cleans the REQUIRES tags from the CC, cleaning it up so that the tags don’t get counted in yesno. Recall that the REQUIRES tags have already been pushed down to the elements of the CC by hashreqs.
yesno adds a "yes/no/maybe" to the conchain. The choice is made based upon how well the CC scores and whether a "not" is present.
readdrequires pulls the requirements back up after answering a question and before storing so that the nuance of the question can be preserved.
So, the purpose of this exercise is to see what I need to do to create a routine that produces yes/no answers to questions involving attributes. I need to look at some other examples--particularly where negation is in the question ("is the ball not red?") but I think I have what I need. If the CCs are in normal forms, then all I should have to do is invoke yesno on them(?) Once done, I should be able to eliminate self-talk as a method for evaluating inferences.
April 28, 2009
declques2 checks to see that propositions are in the correct form before trying to answer the question. There are some associated patterns in cc-patterns. There are also some patterns for attribute assignment from long-gone routine qattr.
I have to think about this just a little more. And, I have to get to the office now.....
April 29, 2009
define Root-7fbf [999967296] (142607904) label sent-declare-attr2 subject ball-8080 verb tobe-807f attribute red-807e tense present number singular person third auxtag no-object-context
This is the form that the CONDITION is in following substitution and just before testing for yes/no value for the sequence "if the ball is red then i am happy. the ball is red."
I need to make the ATTRIBUTE into a REQUIRES.
Okay, the above scenario works.... There are a few more that don’t. I need to make "if the ball is not red then I am happy" work as well. And I need to add processing for orthogonal stuff. It’s all (gonna be) good. G’night.
April 30, 2009
selftalk is used in several spots to derive yes/no answers. I will take the work I started inside of tests.c and move it outside so that CC-based yes/no tests are available from anywhere as a replacement. The new routine will be called ccyesno, inside of tests.c.
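The rough shape I have in mind for ccyesno, with guessed-at names; the scoring and negation checks stand in for whatever the real comparison against the context turns out to be:

    /* Sketch only: the eventual ccyesno in tests.c will differ. */
    enum answer { ANS_NO, ANS_MAYBE, ANS_YES };

    struct con;
    extern int cc_score(struct con *cc);    /* hash/score the CC vs. context */
    extern int cc_negated(struct con *cc);  /* EMPHASIS/WANTS 'not' present? */

    enum answer ccyesno(struct con *cc)
    {
        int score = cc_score(cc);
        if (score <= 0)
            return ANS_MAYBE;               /* nothing in context matches */
        /* a good score says "yes" unless negation flips the sense */
        return cc_negated(cc) ? ANS_NO : ANS_YES;
    }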
May 6, 2009
Time flies. They’re fast.
May 8, 2009
Making progress. Have to drink beer now, though.
May 8, 2009
Working on 'if the ball is not red then i am happy. the ball is not red.' The snippet here is taken from yesno, just before it evaluates the answer to the sequence. It gives the correct answer. I need ccyesno to give the correct answer too....
define Root-80dc [999967011] (142615992) label question-decl-3 wants present emphasis not-8013 subject ball-8015 verb tobe-2 ignore red-1 wants not-8013 tense present number singular person third quote Root-80dc
Here’s what I find in ccyesno, just before yesno gets it:
define Root-7fd3 [999967276] (142618048) label sent-declare-attr2 auxtag no-object-context person third number singular tense present attribute red-8094 verb tobe-8096 subject ball-8097
The ball has an EMPHASIS ’not’ tag on it. It looks like the root needs a WANTS ’not’ in order to score well. I don’t know that there’s an mproc for this. The tag is normally inserted by the grammar. If there’s no routine, then I shall craft one.
May 11, 2009
"if the ball is not red then i am happy" "the ball is blue" This doesn’t work. I think it doesn’t work because the hash tags don’t cause the right comparison. Checking to see if it used to work.... hang on.
Phew! It never worked. Um. That sucks too.
May 12, 2009
I am starting to think that I am hacking senselessly. The tag balancing, the scoring--it's all kinda ugly. I have to work on normalizing the forms and on answering questions with multilevel comparisons. Gar. I may abandon what I have been doing, restore selftalk and improve it, and try to clean up the forms.
June 1, 2009
This is a pretty big day. I handed the reins of Halestar over to Phil and Rick, and I am not certain how I will support myself. I've really nothing left in the bank at this point...
Yesterday, I completed and submitted a paper for the Polish Chapter Thing Something On Natural Language Applications. It was done in LaTeX. I always wondered why people were so enthusiastic about LaTeX, but it is pretty nice.
So, what to do... I always have a head full of silly projects, but I think the silliest one is Brainhat, so I am going to go back to work on it while I prepare for the possibility of taking a real job somewhere. I was at the opening of the Connecticut Science Center on Saturday night. One of the things this new museum is missing is exhibits. It's really boring looking. Maybe they need a magic box! A box that chats with you. Maybe everyone needs Magic Talking Box (sm).
There was a commercial on the radio in the early 80s, when Bob and Ray were a popular team on WOR in New York. The commercial was for National Semiconductor, I think. Anyway, they are doing a bit where one of them is talking about a new typewriter with a platten that is a mile wide. "Who would want that?" "Well, I don’t know, but I only gotta sell one..."
So, magic talking box....
Doing an experiment with malloc and free tonight. I have my own memory management, but there's supposed to be a really good allocator in glibc. My memory management prolly sucks in comparison. I am going to run some tests and see. One test will take a series of user inputs to see how long the program runs. The other will take the same input and stop to see how big the program's memory footprint has become. I begin by cloning the Brainhat directory and making memory management changes.
Hmmm... I wonder if I wouldn't be better off just saving the context somewhere and calling a break() after a drunken dash through memory....
June 2, 2009
Looking at memory... I have these alloc links--a chain of allocation chaperones for each of the concepts, intlinks and conchainnodes allocated at runtime. Each newly created concept is paired with an allocnode. Only when the concept is made permanent does it lose its connection to the allocnodes.
      allocnode  allocnode  allocnode
    -------->o--------->o--------->o---...
             |          |          |
             o          o          o
          concept    concept    concept
Saving a concept currently means searching through the allocnodes for the associated concept and cutting it out of the chain. The allocnode is returned to the spares and the chain is re-spliced. When it is time to "reap," the allocnode chain is rejoined to the spares.
One of my goals from way back was to reap more often so that the memory footprint didn’t swell and recede like an iron lung. I think that I wanted to make a link count to the associated objects.
Here’s my experiment: I get rid of the allocnodes. Instead, I link the concepts together directly through another element in the datatype. I also add a short link count to each concept. When it goes to zero, the concept gets deleted and the chain holding them together gets re-spliced. I will use free() and pitch the unused concept into the sea.
If it works well for concepts, I will move onto conchainnodes and intlinks. I have to think about what to do with text strings too....
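In code, the experiment might look something like this; the field names are illustrative, not what def.h will actually say:

    #include <stdlib.h>

    struct concept {
        /* ...concept payload... */
        struct concept *mprev, *mnext;  /* built-in allocation chain */
        short refs;                     /* link count */
    };

    static struct concept *alloc_chain; /* head of live allocations */

    struct concept *con_alloc(void)
    {
        struct concept *c = calloc(1, sizeof *c);
        c->refs = 1;
        c->mnext = alloc_chain;
        if (alloc_chain)
            alloc_chain->mprev = c;
        alloc_chain = c;
        return c;
    }

    void con_release(struct concept *c)
    {
        if (--c->refs > 0)
            return;                     /* still referenced somewhere */
        if (c->mprev) c->mprev->mnext = c->mnext;
        else          alloc_chain = c->mnext;
        if (c->mnext) c->mnext->mprev = c->mprev;
        free(c);                        /* pitch it into the sea */
    }

No searching for chaperones: saving or freeing a concept is just a re-splice of its own links.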
media! there’s still an element of a concept called media! shades of mp3.... removed it.
Here’s the plan:
• change def.h; add forward and backward memory links to concepts
• change memory.c:
    • alloc concepts with malloc() and string 'em on the built-in in-memory chain that I just mentioned.
    • yank 'saved' concepts out (no searching, yay!)
• modify conprt() so I can see what's going on
This is a half-measure, really. A full measure would include some thought to making link counts go up and down as other concepts link to this one or remove their links. That's coming, maybe... the issue is that I am not explicitly deleting anything. Rather, I am explicitly saving things. That means that there's no opportunity to decrement a link reference count. Gotta work with save-only for now.
June 3, 2009
Works! Was a little bumpy... but it works. Now, I will do some comparison between old and new. And, whether it’s good or bad, prolly modify some of the other quantities--intlinks, etc., to use the new memory management scheme.
Test one.... just a little bit of input:
if a man has a girlfriend then a man is happy.
if a man likes a woman and a woman likes a man then the man has a girlfriend and the woman has a boyfriend.
if a man sees a beautiful woman then a man likes the woman.
if a person is near a thing then a person can see a thing.
if a man is handsome then the princess likes the man.
the princess is beautiful.
mario is near the princess.
mario is handsome.
Old Brainhat time:

    real    0m0.957s
    user    0m0.868s
    sys     0m0.032s

New Brainhat time:

    real    0m0.782s
    user    0m0.724s
    sys     0m0.008s
Now I am going to look at the amount of memory consumed.
Old Brainhat memory:
15544 14332 pts/0 T 12:50 0:00 ./brainhat -i /tmp/foobar
New Brainhat memory:
15404 14164 pts/0 T 12:51 0:00 ./brainhat -i /tmp/foobar
...about the same. The first figure is virtual size. The second is RSS. I imagine that they'll diverge as the application grows. In any case, I am going to add new memory management to all the components that I can. I'll add text too; text strings have been allocated and left lying around up to this point.
I’ll do intlinks next.
• remove ilnk_spares.
• change ilnk_allocs to type intlink.
• change alloc_intlink.
• change freeall_intlinks.
• change keep_intlink.
• change reap_intlinks.
• change conrtns as appropriate.
Then conchainnodes, clinks.
I found this when I was making the changes for clinks:
keep_clink (p); // <----------- Imagine that! Left from when?
That is inside create_link()! That means that all clinks are always saved. When I remove it, the program crashes on the following sequence:

hi.
how are you?
what is your name?
if i like you then you like me.
i like you.
I am going to see if I can figure it out....
More debugging... found some things. This sequence breaks things now:

how are you?
what is your name?
hi.
June 5, 2009
I was working on all of this some more.... I got through the intlinks and clinks with some issues--issues that needed to be ferreted out eventually in any case. There was one vexing remaining issue. I think I found it, but I gotta go make a living right now...
This is a printout of "You like girls" from a copy of Brainhat where the clinks are not given back by an ultimate call to free(). When I uncomment the call to free(), printing this causes a segv on the second line--where child-of things appears. I think the problem may be as simple as that I am not explicitly saving child-of links. I probably wouldn’t do this because I wouldn’t want to follow these links recursively when trying to tuck a concept away in the context.
June 6, 2009
I fixed it... while drinking beer, nonetheless. Routine addthingtag, for making subjunction classes into ’things’, saved the subjunctive root, but not the link connecting it to the concept "things."
June 7, 2009
"If you want your product to behave like a human, it has to think like a human."
June 8, 2009
I am trying to figure out why memory management of conchainnodes causes everything to break. On the way, I am asking myself what happened to PARTIAL matches. The facilities are all intact, but I don’t see uses in input-patterns.
Got past the first issue with reaping conchainnodes. I’m onto the next.
I found a particular call to getconchainnode in find.c that was the culprit. I didn’t look into it more to see why; I just made it save the conchainnode and now it behaves. Time to try my memory/speed test again.
New Brainhat time:
Gar! Segv. More digging...
pcomp! Looking at pcomp, it sure needs to be renamed. It is the routine that processes an inference template, hashes it and stores the bits. Anyway, a conchainnode needs to be stored...
Gar! Another segv. It’s a problem with clinks that I missed the other day. Digging...
(later) Back in the game. It was links in pcomp. Running my experiments again:
Old Brainhat:

    real    0m0.971s
    user    0m0.880s
    sys     0m0.028s

New Brainhat:

    real    0m0.813s
    user    0m0.708s
    sys     0m0.016s
... that hardly seems worth all the suffering. Let’s look at memory.
Old:    dowd 22369 10.1 2.8 15544 14332
New:    dowd 22377 12.6 2.5 14216 12964
All I can say is that I hope that scales!
I still need to look after text strings, but I’ve been in memory land for a number of days now...
Gar! Another explosion.... This time it was pushtense. Fixing....
Something to look into:
>> input
Input file name: /tmp/input
>> who does mario like
cute mario near the beautiful princess likes the beautiful princess.
>> does she like him?
maybe. I do not know.
>> who does she like
she likes cute mario.
>> why
the princess likes cute mario because he is cute.
>>
Next up: finish cleaning up memory management. I would like the REAP and KEEP debug variables to do something more predictable. There should be a "noreap" flag so I can debug memory management. The "keep" flag should spit out memory management stats.
Since it looks like I am keeping selftalk, make the buffers and the debug better. A debug flag called SELFTALK might be nice. Plenty to do....
June 11, 2009
This is probably someone’s birthday.
Made some changes to debug.
I downloaded a command-line editing library called tecla. It will allow me to have emacs editing on the input to brainhat for dialog and debug, I think. I would only plan to use it for stdin when it’s a tty. The challenge is that brainhat doesn’t sit waiting for input on stdin alone. Rather, it sets up a select and watches any number of fds.
Tecla is going to operate in cbreak mode, handling each character as it comes. I don’t want to hear from tecla until it has a complete input. Accordingly, I need the tecla code to run in a separate process. There’s a leftover reference to child sockets in main and debug. I am going to try to use that. Here’s how things might look:
        stdin   stdout
           \     ^  ^
            \   /   |
             v /    |
          ---------  |
          | tecla |  |
          | code  |  |
          ---------  |
              |      |
    other FDs | child sckt
          |   |     ||
          v   v     vv
          ------------
          | brainhat |
          ------------
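A sketch of the separate-process idea, assuming libtecla's documented new_GetLine()/gl_get_line() interface; the socketpair plumbing is my own guess at how to marry it to the existing select() loop:

    #include <libtecla.h>
    #include <sys/socket.h>
    #include <string.h>
    #include <unistd.h>

    int start_tecla_child(void)
    {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
            return -1;
        if (fork() == 0) {                        /* child: the tecla code */
            GetLine *gl = new_GetLine(1024, 4096);
            char *line;
            close(sv[0]);
            while ((line = gl_get_line(gl, ">> ", NULL, -1)) != NULL)
                write(sv[1], line, strlen(line)); /* whole lines only */
            _exit(0);
        }
        close(sv[1]);
        return sv[0];   /* brainhat selects on this fd with the others */
    }

That way tecla owns the tty in cbreak mode, and brainhat never hears from it until a complete input arrives on the child socket.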
Though... If it were more tightly integrated I could feed it hints (in lieu of tab-filename completion). Gotta think about it.
What would be most profitable to pursue at present? I could go through the test cases, seeing what has broken and making fixes.
Then I could get meme shifting working again, without the database. Or I could spend time on dealing with corrections to facts and their cascaded results. Or... I will fix broken examples and then get meme shifting working. It'll take the better part of a month, I suspect....
(later) Well... I only saw two problems in the tests: test064 and test059. That's not too bad. I am going to go on to memes...
June 12, 2009
In the summer of 2005 I was working to resurrect the database version of the code. The database code was written by Rich, and it required tables be initialized in a SQL database. Some of the code for initializing the tables had gone missing, so I wrote some code to read the appropriate files and hand them off to Rich. He never got around to it, so the code was never used. I still have it.
The database was kind of opaque to me. It's a difficult enough set of concepts: the program would source vocabulary from the database (unnecessary, I think), and use it to cycle through memes in testing and execution. The actual code was impenetrable; Rich wrote ugly code that worked reliably.
I wrote all the code that implemented the focus shift in Brainhat, including intermixing memes with the context and then backing them out when the focus changed. I also wrote the code that ran inference templates dry (meaning, the inference could fire without causing any side effect). The only thing to do is find a different place for the memes to live.
Last year, I wrote code to save Brainhat state in binary form. This creates a very small footprint for a saved meme. The intention at the time was to use it for broadcasting between Brainhats. I could use that here. Or, I could save the memes in XML or some other text-based format (and be able to see them!) Or both!
The plan:
• Load the meme map
• Load the memes: As each meme is loaded, the hashes for the statements and templates are stuck into the hash table with the meme ID recorded (as before).
• At run-time, the hashes are used to shift memes (as before).
• As with my late experiments, it could also be possible to add a little bit of Brownian motion into the meme shifting to cause shifting when it gets boring.
• I also want to add a capability to the meme map so that shifts that weren't available at first become available later, and shifts that were available at first become unavailable. This would provide a way to bump a conversation down the road.
I will begin by surveying the landscape and making data structure changes as needed.
Another thing that the database did was save conversations under a user name and password. That was valuable for conversations that took place over time, such as email exchanges. I don’t have a database now... but it would be nice to have that. Maybe I should use a database? I guess I should.
I would save the memeplexes for general use, the context for individual user/memeplex tuples. There should be a password too. All like Rich had.
This means that an mplex could be initialized into the database or from disk. If I treat the processing symmetrically--that is, I can do whatever I do from the file system from the database as well--then I can concentrate on functionality for now and add the database back later.
To prepare the memes:
If there is no meme-map specified, then the meme-map is assumed to include core:core only, which may be empty or non-existent. The source would, by default, be found in memes/core:core. If a meme-map exists, then it would be found in memes/meme-map
If the meme-map exists, then the components must be located and possibly compiled. Compilation will occur if the meme source (in the memes directory) is newer than the compiled version (in the data/memes directory), or if the compiled version does not exist.
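The compile-if-newer rule is just an mtime comparison; a minimal sketch (the function name and the decision to keep a stale binary when the source is missing are mine):

    #include <sys/stat.h>

    /* Returns nonzero if the meme source must be (re)compiled. */
    int meme_needs_compile(const char *src, const char *bin)
    {
        struct stat s, b;
        if (stat(bin, &b) != 0)
            return 1;                     /* no compiled version yet */
        if (stat(src, &s) != 0)
            return 0;                     /* no source; use what we have */
        return s.st_mtime > b.st_mtime;   /* source newer => recompile */
    }

So meme_needs_compile("memes/weather", "data/memes/weather") would say whether the weather meme wants a forked compile pass.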
At run time, the meme-map will be loaded. The dates/states of the compiled components will be checked. Any meme that needs compilation will be updated by forking a copy of Brainhat, ingesting the source and saving the compiled version into the data/memes directory.
When all the memes are compiled and available, Brainhat will load them and begin execution. It may also, depending on how invoked, reload the context and other buffers. At a call to bhatexit, Brainhat may save the current context into data/<token>, returning a copy of the token to the program or thing that invoked Brainhat.
All of this is transferable to a database in a straightforward fashion. This will be a nicer environment than before because the steps of compiling and running won’t be disjoint, and there won’t be two code branches.
June 14, 2009
Still ploughing through. I simplified the memes/memeplex stuff....
June 15, 2009
I can load in the mememap. Now I need to glue it to the main body of Brainhat code.
Let’s take a walk through and decide what should go where....
Most things are initialized in main. Control is passed off to dialog. Then, depending on the mode of operation, the code either keeps going or waits for a network connection of some sort. Here’s a question: should I initialize the memes in dialog, after the fork occurs? This would allow me to load a baseline brainhat and choose what memes should run afterward. On the other hand, I do have code for restarting a running brainhat... For simplicity’s sake, I will have Brainhat load up the memes before entering dialog. I can always re-init downstream.
I am leaving vocabulary-based shifting out. It occurs to me that if an inference makes a meme attractive, then the vocabulary is probably in there. All the same, I may add vocabulary-based shifting back at some point. To me, the old code is impenetrable because it is both looking for words and populating the symbol tables from the database.
The old database code has a meme ID. This is still useful.
Among the routines that make sense from the old meme.c are (not listed by current names):
• shift to a random meme, relative to the current location and restricted by adjacency candidates.
• initialize a meme. That means create a meme ID, check to see if the source and compiled versions are up-to-date, load all the stuff into core to make ready for a meme shift.
• shift to a specific meme, given its address in memory. This could be done via its meme ID too, if we were using a database.
• set real/test mode. this is necessary so that when we shift into a meme to run inference tests, we don’t cause any side effects.
• test adjacent memes to see if they are interested in firing. If so, maybe shift...
File state.c has the code for shifting memes in and out. Each concept has a meme id. It’s possible that it would be sufficient for each conchainnode to have it instead, I suppose. At any rate, I need to do a little bit of cleanup in state.c. I created code for reducing the context to a binary representation late last year. This will be the format for saving the memes as they get read in and shipped back out to disk (or a database, eventually).
There is a routine called restore that is missing its corresponding save. It must be in an old copy of Brainhat. restore recreated the context, saved question, everything.
The routines that save the very small binary forms of the context are mis-named savestatetxt and restorestatetxt, respectively. I will rename them txt -> bin. They don't save or restore the "other stuff" the way the restore routine and the missing save routine do. Eventually, they need to be extended.
I am going to hack out the database routines. They spend much of their time looking for unloaded vocabulary. Vocabulary is small; we are going to keep all vocabulary loaded.
Changing the name of {save,restore}statetxt to {save,restore}statebin. I found an old copy of savestatetxt from before I modified it for binary saves. I will put that back into state.c in case I have a use for it.
Hmmm.... A binary restart:
dowd@compaq:~/brainhat$ ./brainhat -R idx1321
Meme Set Focus: Brainhat Core
Brainhat, Copyright (C) 1990-2009, Kevin Dowd, Version 3.090615.
>> am i happy
yes. You are very glad.
>> what do i see
Segmentation fault
Got work to do. Gar. Oh! I know what it is. I have to save memory after loading the restart file. Be right back!
That was it! Woohoo!
Something I need to fix:
Break in debug at the start:
debug> xspeak 1
You want mario likes You. You love the princess.
debug> cont
>> what do i want
I do not know.
>>

June 16, 2009

I'm tired. I've had the boys the last two days, and I haven't been smart enough to adjust my schedule... staying up too late, getting up like a normal human.

Now, I am going to incorporate loadmemes code into the main trunk of Brainhat, including checks for conditional compilation. I will update the debug routines that allow me to switch between memes to verify that they load okay and that the context survives the switch. When that is behaving, I will address the automatic meme shifting routines... well, their rewrites, really.

Ugh. nsymtab.c is polluted with ransom-note style variable names and vague references to "methods." Particularly, there is a routine called srchsyms that uses the database to look for a particular concept across all of the loaded memes. Since we are not using the database to locate vocabulary, I will strike this and its references in find.c and debug.c.

Stopped in find.c... looking at $m (match across memes) pattern. Having a Halestar meeting. I can eliminate srchsyms. As it is, the $m pattern is treated like a $c pattern.

June 17, 2009

Fix this:

>> if you see a thing then you want a thing
if I see a something then I want a something.
>> you see a block
I see a block. I want a block.
>> why?
I want a block because I see a block.
>> you see a ball
I see a ball. I want a ball.
>> why?
I want a block because I see a block.
>> why do i want a ball
You want a ball. really? huh....
>> why do you want a ball
I want a block because I see a block.
More trouble:
>> you are a girl
I am a girl.
>> are you pretty?
maybe. I.
>> what are you
she am a girl.
>> are you pretty?
yes. I.
Back to incorporating loadmemes....
June 18, 2009
I finished adding command line flags and routines for loading the meme map. The new meme-related routines are in nmeme.c. There are definitions for memes, memeplexes and candidates in the file loadmemes.h. These need to be incorporated into def.h, and loadmemes.h has to disappear. Along the way, the existing structs for meme handling have to be merged into the new ones or replaced. This will cause a number of routines to moan.
There is a ’short’ in the definition of concept called ’meme,’ which identifies which meme the concept came from. I am going to rename this to ’mid’ (meme ID), and throw out the old definition for ’mid,’ which was an intlink list of all the memes the concept appeared in.
I also need to add code for compiling and saving the memes.
It’s late.
June 22, 2009
Back to work...
Merging the structures from loadmemes.h into def.h. Done. Also modified everything that looks at a meme ID. Recreated focusTest (now focustest). Have to compile memes yet.
(later) Got memes compiling, but not without some issues. I am getting a message of "referenced tag %d in %s not found in basic pool" for meme "food", upon restore. Meme "happy" works great. Need to debug binary storage routines in state.c.
Tired, though. Made fajitas with MK. Drank beer. Have lotsa farting to do.
June 23, 2009
Regarding the problem with saving and restoring, I looked at the meme that was giving me a problem and picked the culprit out right away: "your name is alisha." I am guessing that since "alisha" is a dynamically created label (or is it a dynamically created concept...?), it isn’t getting stored away appropriately.
'define Root-7ff0 [dirty] (157935496)'
'tag 157935496'
'hashval 232'
'attribute alisha-7ff9'
'verb tobe-7ff2'
'subject name-7ff3'
'define alisha-7ff9 [dynamic] (157929832)'
'tag 32718'
'tense present'
'orthogonal name-1'
'define tobe-7ff2 [dirty] (157929768)'
'tag 1253'
'person third'
'tense present'
'number singular'
'define name-7ff3 [dirty] (157929704)'
'tag 2336'
'attribute alisha-7ff9'
'attribute Root-7ff5'
'wants propername'
'define Root-7ff5 [dirty] (157928696)'
'tag 157928696'
'tense present'
'objprep brainhat-1'
'prep of-7ff6'
'define of-7ff6 [dirty] (157928400)'
'tag 6308'
'wants things'
One can see "alisha-7ff9," above, with a "dynamic" concept designation. It is being saved okay, by the look of it. Maybe it isn’t being resurrected correctly.
Hmmm... I added some code so that the dynamic concept would be reconstituted. The issue is that there’s no provision for the labels in the compact binary form. That is, the binary save data is a column of pairs that indicate links and tag numbers. "Alisha" doesn’t exist in the symbol table, so I can’t clone a dirty copy of her from the save data. I need a mechanism to get her labels and synonyms across in the binary save file.
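Something like this might do it (just a sketch--the record layout and the routine name are made up, not what’s in state.c): append a (tag, label) record for each dynamic concept after the link/tag pairs, and rebuild symbol table entries from the records on restore.

/* Sketch: append a label record for each dynamic concept to the
 * binary save. Per record: tag, label length, label bytes. */
#include <stdio.h>
#include <string.h>

struct con;                            /* concept; opaque here */
extern int   con_tag(struct con *c);
extern int   con_is_dynamic(struct con *c);
extern char *con_label(struct con *c);

static void save_dynamic_label(FILE *fp, struct con *c)
{
    int tag, len;
    char *label;

    if (!con_is_dynamic(c))
        return;
    tag = con_tag(c);
    label = con_label(c);
    len = (int)strlen(label);
    fwrite(&tag, sizeof tag, 1, fp);
    fwrite(&len, sizeof len, 1, fp);
    fwrite(label, 1, (size_t)len, fp);
}

The complementary loop on restore would read the records and stuff "alisha" (and her synonyms) into the symbol table before the link/tag pairs get resolved.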
I kludged in something so that the dynamic concept is restored with a dynamic name. I need to go back and fix this. In general, the dynamic concept in support of a "your name is X" statement is broken, and I’ll probably handle it differently in the future anyway...
So! Back to loading memes.... Cool! I compiled a whole mememap!
Something to fix: Reloading memes/weather, I don’t think that the hashes are being placed into the hash table correctly. And then there was a segv... gar.
>> do you like the weather lady? maybe. I do not know. >> do you like the weather man? no. I do not like the living weather man. >> why I loves the dead weather lady because I did love the dead weather lady. >> why do you not like the weather man? I loves the dead weather lady because I did love the dead weather lady.Segmentation fault
Looking into it some more, I think the reload is okay. It behaves the same whether from the binary form or the text form. Gotta fix segv, but am going to continue with memes. The next step is to load memes as they are compiled.
Later in the evening...
I need to look at the routines that load and tag memes. What is going to happen is that all of the memes will be loaded, but only the context + the current meme will be visible. Or is it context + core + current meme... No matter. I wrote some routines about five years ago to interleave and expunge memes from the context and associated hashes. It is time to revisit this code. Once I have a handle on it again, I will reconfigure the debug-based meme shifting options so that I can cycle through the memes by hand. Once that works... automatic meme shifting....
June 24, 2009
I am loading memes and manually shifting. Looking for trouble.
Looking good. Now I am going to enable automatic meme shifting.
Later... I added a loop to pick through the memes using hlscore. I need to be pickier about the presence of vocabulary in a meme, since I no longer have the database. More importantly, I notice that inferences aren’t being saved (or perhaps restored) in my binary meme images. Need to look into that tomorrow....
June 25, 2009
I am enjoying the last days of Halestar. I learned more lessons about friends. Learning can be fun! I’m told...
June 27, 2009
Been off the job a couple of days because of beer, women and Halestar. Hoping to race tomorrow. Mary Kay was to be part of the crew, but she exploded tonight when I said some harmless yet complimentary thing about another woman... Sheesh.
I am still looking for why inferences don’t get loaded on a binary restore. There’s beer in me, so I am not sure how effective I will be tonight. Though I was pretty good with the piano a moment ago.
This is a little inconvenient: if I do a "brainhat -i input -S output.mbin", nothing gets recorded in the output on exit. It’s probably for a good reason. Just annoying... If I do a "-r" it works.
Here’s part of the mystery.... This is from "If I like you then you like me." The two clauses are in the subjunctive (no-object-context), so they don’t end up in the context by themselves. The inference is missing completely... Gotta find that.
define I-7fb6 [2760] (8a47660) label I label Me label brainhat label brain hat label yourself label system label u label you label brainhat-1 orthogonal human-1 person first number singular child-of human-1 define to like-7fb4 [9116] (8a47608) label to like label to-like subform liked-1 subform like-2 subform like-1 subform likes-1 child-of action define You-7fb2 [2741] (8a475b8) label You label caller label speaker label me label i label speaker-1 orthogonal human-1 number singular person second child-of human-1 define Root-7fb0 [999967311] (8a47558) label Root-7fb0 person second number singular tense present object I-7fb6 verb to like-7fb4 subject You-7fb2 auxtag no-object-context define You-7fae [2741] (8a47508) label You label caller label speaker label me label i label speaker-1 orthogonal human-1 number singular person second child-of human-1 define to like-7fac [9116] (8a474b0) label to like label to-like subform liked-1 subform like-2 subform like-1 subform likes-1 child-of action define I-7faa [2760] (8a47460) label I label Me label brainhat label brain hat label yourself label system label u label you label brainhat-1 orthogonal human-1 person first number singular child-of human-1 define Root-7fa8 [999967319] (8a47400) label Root-7fa8 person first number singular tense present object You-7fae verb to like-7fac subject I-7faa auxtag no-object-context
I verified that the inference wasn’t saved.
Now, sleep.
June 28, 2009
Good morning.
At the time that savestatebin is shipping concepts out to file, the read/write flag for inferences is set to FALSE. Accordingly, the inference isn’t being saved. This sounds wrong to me--I think that the flag should be set. But I need to investigate.
Setting the ’rw’ flag for the inference template inside addinfctxt makes the save and restore work. The question is: what breaks? I am going to run the QA tests to see. Ran. Need to look over the results... passable.
The processing of names is causing more hiccups....
dowd@netbook:~/brainhat$ ./brainhat -m +verbatim +repeat Brainhat, Copyright (C) 1990-2009, Kevin Dowd, Version 3.090615. >> what is your name my name.meme ’weather’, score 1000000 meme ’core’, score 1000000 >> hi hello.meme ’weather’, score 1000 meme ’core’, score 1000 >> hi hello.Segmentation fault
It’s happening with memes turned on only, so far. The meme map I am testing with shifts into ’weather’. Maybe it’s that meme. Trying... Nope. Something is getting screwed up in meme shifting, methinks. Turning off memory management.... Yep! Something isn’t getting saved. I will bisect routines, looking for the culprit, as soon as I get through meme shifting.
Hmmm... Looking at the scoring in hlscore. Three flags are set: NOCOMMIT, NOINFERENCE and NOOUTPUT. If I ask "do you like food?", which should score big in one of the memes, I get a high score in all of the memes. Subsequently, I get an answer of "yes. I do like food.", though I cannot see it in the context. If I set NOCOMMIT before the program starts, on the command line, things seem to behave--only one meme scores high and the answer is still "maybe" after the test. I can see that NOCOMMIT is set twice when it’s behaving... A little mystery with multiple parts....
Gar! There are two flags: MEMERATE and MEMETEST. What’s the difference? This wasn’t my code...
Things are funky. I found that the basic pool is full of dirty brainhats! My restore wasn’t as well executed as my save, apparently. Anyway, this explains some of the funky behavior with the evaluation of memes via hlscore.
June 29, 2009
It’s clear that I am making a copy of every dirty concept in the input stream each time it appears, even if it’s a duplicate. Bad programmer. Bad. I’ll fix it tomorrow if there’s anything left of me after laying off all my employees. I need to keep track of the concepts I have seen and reuse them on restore... just as I do when I save them.
July 1, 2009
I have a stolen moment.
I found the part of restorestatebin where I need to be careful not to restore things more than once.
I also found the routine, addtocontext_h, that is adding the restored symbols to the clean symbol table. Why did I do this? Perhaps it was for symbols that weren’t yet in the symbol table; the database version loaded symbols as needed. I think that addsymtabNoDB (gonna rename that back to get rid of the ransom note letters) wouldn’t add something twice.... Checking... The answer is in the notes from seven years ago, June 16, 2002. Can you believe I’ve been working on this for that long!? Longer, even! Anyway, I think it has to do with the database and that I can skip the call to addtocontext_h since we are using one symbol table at present.
That fixed a lot of problems. Now I have to go back to restorestatebin and get rid of redundant copies of the same concept.
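The plan, roughly (a sketch with made-up names; the real bookkeeping lives in restorestatebin): keep a "seen" list keyed by saved tag, and hand back the already-restored pointer instead of cloning again.

/* Sketch of dedup on restore: remember each concept already
 * reconstituted, keyed by its saved tag, and reuse the pointer
 * on subsequent references. */
#include <stdlib.h>

struct con;                            /* concept; opaque here */
extern struct con *clone_from_savedata(int tag);

struct seen { int tag; struct con *c; struct seen *next; };
static struct seen *seenlist = NULL;

static struct con *restore_once(int tag)
{
    struct seen *s;

    for (s = seenlist; s; s = s->next)
        if (s->tag == tag)
            return s->c;               /* already restored: reuse */
    s = malloc(sizeof *s);
    s->tag = tag;
    s->c = clone_from_savedata(tag);   /* first sighting: clone */
    s->next = seenlist;
    seenlist = s;
    return s->c;
}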
July 2, 2009
Done. I am looking at the output from hlscore. It looks promising. But I am seeing a more fundamental problem with hashes not retrieving stuff all the time. Specifically, I can tell brainhat "You like food" and it will recall it okay (is it just looking at the beginning of the context?) However, when it’s buried in meme "like5", it doesn’t answer correctly. I am going to go back to a month ago and see if this is something I broke recently....
>> what do you like? I do not know. do You see the sun? >> do you like food? yes. I do like food. do You see the sun? >>
It’s just lameness on my part. The hash can’t locate "I do like food" when given "what do you like?" I need to think about this. In the interim, I will finish up meme-shifting.
July 3, 2009
Some lofty thoughts.... I wonder if I can use a captured sequence of hash values in some way to create a Bayesian predictor of where the conversation will go. I wonder if I could mine this from existing corpora.
I put meme shifting back in! It works great! Now I’m all jazzed up!
Short term improvements to be made:
• I need random meme shifting put back in.
• I need to be able to shift on a vocabulary match. I can think of several ways to do this, including collecting a list of words for each meme at compile time and tacking it onto the end of the binary (tags, of course.... not the actual words); see the sketch after this list.
• I might want an occasional broad sweep of all the available memes to detect if the conversation has changed so much that I will never shift my way out of where I am. Simulated annealing...
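Here is the vocabulary-match item sketched out (the struct members and the routine name are mine for illustration, not actual Brainhat code): count how many of the input’s concept tags appear in the tag list tacked onto each meme’s binary, and treat the count as a shift score.

/* Sketch: score a meme by counting how many of the input's concept
 * tags appear in the tag list compiled into the meme's binary. */
struct meme {
    int *vocabtags;                    /* tags saved at compile time */
    int  ntags;
};

static int vocab_score(const struct meme *m, const int *inputtags, int nin)
{
    int score = 0;
    for (int i = 0; i < nin; i++)
        for (int j = 0; j < m->ntags; j++)
            if (inputtags[i] == m->vocabtags[j]) {
                score++;
                break;                 /* count each input tag once */
            }
    return score;
}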
I put random shifting on timeouts back in. It really is pretty fun....
July 5, 2009
Cleaning up stuff. Eliminating some mprocs. I got rid of assignt, a routine that used to save away a partially completed match. I haven’t used that in some time...
I left brandnew behind because it was interesting. This was a routine I used when I was working with the VRSim. It creates a new dirty copy of a concept based on discovery of the attribute "new" in the input.
I deleted bestfit; it was for picking through multiple interpretations from a speech engine. Likewise for firstfit. I like them. I used to use them with... the Nuance speech engine. I will remember them, should the need recur.
Getting rid of dial, used for making phone calls.
Getting rid of winampcommand.
Routines dummytime and dummyadvtime sorta look like kludges. They were for asking questions about time. I need more general processing for that...
I haven’t used eventpronouns for many years; I sadly retire it for the time-being.
Routine gettime is ugly and unused. But I may need it. So I left it.
Routine graph is empty. But it is a good spot for inserting code to generate graphs of CCs like we used to with jgraph. I left it there.
Routine iamat was part of peoplething. Deleted. Got rid of ptactionuser and ptname as well.
I want to look at beautifying main.c. I need to do something about some of the globals. And I’d like to hide the ugliness with "speechbuf" assignments.
Goodnight.
July 6, 2009
More cleanup....
Dead: objectroot, rootobj, score (was used in support of SAPI).
I need checkcommand cleaned up and promoted to a directory of its own, the way "speak" and "ponder" are. The directory should be called "imperative."
Next project... should I re-add vocabulary to the meme selection algorithms? Or should I tackle the bigger question: the hash can’t locate "I do like food" when given "what do you like?" I also need meme selection to grope around a little more.... I don’t like having dead air in a conversation. Maybe random meme shifting should give up if it finds no questions to ask and say "i have nothing to say."
July 7, 2009
Halestar is pretty much over. I bought Deborah’s shares yesterday for a little bit of money. I am turning the operation over to Phil, et al. Time to work on Brainhat and look for a job.
July 10, 2009
I posted some code to the web site.
I am looking into the issue regarding questions such as "what do you like?" These are resolved okay if the answer is near the front of the context, but don’t get resolved by hashing if they’re elsewhere. The equivalent question is "do you like something?" Does that work reliably? Making experiments. Hmmm... Sorta behaving...
question-what-2 matches "what do you like." question-what-3 matches "do you like what?" They both use declques2. Looking at it, it hashes on the exact things one requests. If that fails, then it crawls backward through the context (10 deep) looking for candidate answers, matched against a CC pattern with the appropriate parents.
define declques2 rule $l‘SUBJECT{&C^0^0}‘3$l‘VERB{&C^1^1}‘4$l‘OBJECT{&Z^2^2}‘5
So, I think that the correct fix is to add hashing for parents to declques2. I can steal the loop from tests.c. Then, I don’t see the point of walking through the context anymore. Though, I might be too much of a sissy to cut it out.
Hmmm... I’d like these tests to work across memes, in hlscore. I need to think about that.
July 11, 2009
I’m running off half-cocked, and rediscovering what I already knew. I am in the middle of fitting declques with an iteration space like the one in tests.c so that I can use the hash tables to tell me if there’s a more general case of the specific question out in the context somewhere--too far into the context for me to crawl to.
The iteration space gathers all the parents of the interesting concepts discovered in the input and hashes each permutation. For instance, "mario sees princess" will also hash "man sees princess," "thing sees princess," "thing sees woman," etc. This is good if I am looking for a more general instance of a specific thing, which would be the case when I am trying to match an inference template to a concept, such as "if a man sees a woman..."
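The loop I am stealing amounts to this (a sketch; parents_of, hash_cc and probe stand in for the real tests.c machinery):

/* Sketch of the permutation loop: hash every combination of
 * (subject-or-parent, verb, object-or-parent) and probe the
 * context hash table for a prior statement. */
struct con;
extern struct con **parents_of(struct con *c, int *n);  /* includes c itself */
extern unsigned hash_cc(struct con *s, struct con *v, struct con *o);
extern struct con *probe(unsigned hashval);

static struct con *find_general_match(struct con *subj, struct con *verb,
                                      struct con *obj)
{
    int ns, no;
    struct con **subs = parents_of(subj, &ns);
    struct con **objs = parents_of(obj, &no);

    for (int i = 0; i < ns; i++)
        for (int j = 0; j < no; j++) {
            struct con *hit = probe(hash_cc(subs[i], verb, objs[j]));
            if (hit)
                return hit;  /* e.g. "man sees woman" for "mario sees princess" */
        }
    return NULL;
}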
But, it doesn’t work the other way. I can’t test all the child concepts of a given concept to see if there’s a matching hash because of the nature of a taxonomy--it gets much larger as one descends the tree. That’s why I can’t have, for instance, "what does mario see?" answered by a hash (the current scheme, anyway). It would be the equivalent of hashing "mario sees thing," "mario sees ball," "mario sees luigi..." all the children of "things." In other words, it is computationally dangerous.
So, the question is, how do I accomplish this? One idea that strikes me now is that the vocabulary/taxonomy might be dynamically constructed of only those concepts that have actually been invoked in discourse. Or, perhaps a working taxonomy could be constructed alongside the full taxonomy. Then, possibly, I could safely search all the child links. And I’d know that only relevant concepts would live within it.
Maybe it needn’t be a tree at all. I could simply pick through the dirty symbol table looking for children of whatever. That’s awfully interesting....
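It would only be a few lines, too (a sketch with invented names; is_child_of stands in for a vrfy_achild-style test):

/* Sketch: instead of descending the full taxonomy, scan the dirty
 * symbol table--concepts actually used in discourse--for children
 * of a given concept. */
struct con;
struct stab { struct con *c; struct stab *next; };
extern struct stab *dirty_symtab;
extern int is_child_of(struct con *child, struct con *parent);

static int collect_children(struct con *parent, struct con **out, int max)
{
    int n = 0;
    for (struct stab *e = dirty_symtab; e && n < max; e = e->next)
        if (is_child_of(e->c, parent))
            out[n++] = e->c;           /* only concepts seen in discourse */
    return n;
}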
(later) I have started to work on this--making a new routine called getallchildren in conrtns. But I am asking myself about statements-as-objects in the subjunctive. I won’t find these because they aren’t in the symbol table. If I do find a way to keep track of them, then I should be able to hash into stuff like "what does mario know?" These statements are immediate children of things where they do exist. Disturbingly, all the subjunctive clauses ever generated will be up for consideration as the "what" in "what does mario know?"
And if I did offer up all the subjunctive clauses somehow, could I also hash in to find the answer to "does mario know that the princess is pretty?" I would need to match a hash for "the princess is pretty" in order to get this right.
>> mario knows that the princess is pretty mario knows the princess is beautiful. >> what does mario know mario knows the princess is beautiful. >> what does mario see I do not know. >>
It works now. The pattern that matches "what does mario know?" is question-what-2.
>> mario knows that the princess is pretty mario knows the princess is beautiful. >> does mario know that the princess is pretty? yes. mario knows the princess is beautiful. >> does mario know that the princess is old maybe. I do not know. >>
The pattern that matches "does mario know that the princess is pretty?" is question-action-1. This pattern and the one above use declques2 to provide the answer. The workhorse is vrfy_achild in rels.c.
I am trying to decide whether a hash is the right answer. Each subjunctive clause is a child of things. Each subjunctive clause also has its own hash. The former is necessary to answer "what" or "who" questions with a hash. The latter would apply to more specific inquiries.
I could punt and have declques2 crawl through the whole context (it goes 10-deep at present). That’s tantamount to a serial search. That’s ugly, but at least the "mario knows ..." portion of the comparisons (like I have now) would limit the computation. I have to think about all of this.
July 12, 2009
The subjunctive almost never appears in the subject slot. The only examples I can think of are things like "The princess is in love with mario is ridiculous." I don’t even care to provide grammar for something like that.
So, searching for ways to answer "what does mario know?," or "does mario know that the princess is pretty?" might be an exercise in partial hashing--just the subject and verb infinitive. But still, the subject might not be mario. Consider the following:
>> mario knows that the princess is pretty mario knows the princess is beautiful. >> does a man know that a woman is pretty? yes. mario knows the princess is beautiful.
It works so well with vrfy_achild and a serial search!
Perhaps I could do a set of partial hashes with {mario, knows}, {man, knows}, {man, action}, {mammal, knows}, etc., and then return the list of candidates to vrfy_achild (called from vscore) for the answer. And, perhaps I could revert to it only after the serial search (10-deep or so) has failed. Short of coming up with some amazing taxonomic hash algorithm, this sounds like a plan.
The next question is: do I want to do this now? It will improve things, but it might not be critical path.
Punt! I increased the depth to 30.
(later) I am finding that there’s something not being saved (or restored) in the binary saving of memes. Particularly, "I am sad because the weather lady is dead" is getting busted into "I am sad. She is dead." Doing some research.... Looks like CAUSE and EFFECT aren’t being saved. Looking... Appears that the cause-and-effect CC isn’t marked as dirty.... Looking into it. Hmmm.... create_rootcon doesn’t automatically set the RW flag. How do other root concepts get sullied? Gar! I could just set the RW flag to dirty, but I want to know how it happens for the other dirty roots without a problem.
Hmmm... I added a line to addinftoctxt to make it dirty just a few weeks ago, when I was fixing the binary save routines. I could do the same for cause-and-effect and the world would be right again. But I am still perplexed as to how the other CCs are getting set correctly.
(Also, I have another problem to look into: CREDITSPKR seems to be getting set during processing after I’ve unset it.)
"mario likes luigi" is arriving at addtocontext as "justcloned." Now I get it. It’s a fortunate accident that these things are making their way into the context. The shaped and reshaped CCs are clones of the original match. That’s how they are getting the designation "just cloned." I would like to think about it some more, but it’s not a problem that’s going to give me agita later.... as long as I understand it. Anyway, it means I can add an mproc to dirty the cause-and-effect CC. I might also modify addinfctxt to not dirty its CC, but rather have the new routine do it. The new routine will be called makedirty. Done.
Brainhat, Copyright (C) 1990-2009, Kevin Dowd, Version 3.090710. >> i want to have sex with you You want to have sex.Segmentation fault dowd@netbook:~/brainhat/brainhat$ ./brainhat -m +repeat Brainhat, Copyright (C) 1990-2009, Kevin Dowd, Version 3.090710. >> i want to have sex with you You want to have sex.Segmentation fault dowd@netbook:~/brainhat/brainhat$ ./brainhat -m +repeat +reap Brainhat, Copyright (C) 1990-2009, Kevin Dowd, Version 3.090710. >> i want to have sex with you You want to have sex. >> what do i want You want to have sex. >> what is sex sex is fun.
Got a memory problem. I sure was broadsided by that last response though! Funny.
Fixed.
July 13, 2009
I need to assemble a new set of demo memes. And I need to get a Windows version prepared. Then I’ll go back to work on the fundamentals.
July 14, 2009
How about this: "goal1 is mario wants something" which means "if mario wants something then goal1 is met" "if goal1 is met..."
Hmmm... I can do the latter forms with the existing grammar. I just need to add the vocabulary. I also need for speak to not voice anything about goals being met unless a variable (say "goals"?) is set.
Starting a new set of memes for demonstration purposes and (of course) am finding things wrong. First, I have to amend the binary save to make copies of dynamic concepts. Otherwise "My name is Janet" and "I am 25" won’t come across correctly. Second, I need to look into why this doesn’t fire an inference: "If I ask how you are then ask how am i." This works: "if i say hello then ask how am i."
>> how are you I... I do not know. >> do i ask how are you? yes. You do ask are I goodness_attrs. >> ^C
Gar! It was ’idiom’ processing. Gotta think about that later... Needs to get fixed or tossed. Back on track.
Gar! Bug: if I put any comments in the meme input, the first statement gets ignored. I need a bug tracking system.
Here’s the difference between the saved and restored version of the dynamic concepts that held the name in "your name is fred."
define fred-8015 [32717] (9bd3f78) {dynamic} label fred tense present orthogonal name-1 child-of adjective define dynamic-7fa9 [32681] (9d3f270) {dynamic} label dynamic-7fa9 label dynamic orthogonal name-1 tense present
It’s apparent that I need to save child-of links for the dynamic concepts. I am done for the night, though.
July 15, 2009
Hmmm.... No CHILD-OF tags are being saved, it seems.... because they’re not part of LOCAL_SET. Looking into it. Found RESTORE_LINK_SET, which seems closer. Trying it out.
Getting closer. The restored dynamic symbol for "my name is fred" isn’t making it into the symbol table anywhere. I think I need to add that by hand. Also, I am not sure about the hash.
And I have this issue too:
dowd@netbook:~/brainhat/brainhat$ ./brainhat +repeat Brainhat, Copyright (C) 1990-2009, Kevin Dowd, Version 3.090714. >> my name is fred your name belonging to You is fred. >> what is name belonging to me your fred name is name-or-address.Segmentation fault
It is memory related! I’ll be back later... trying to get rid of Halestar, still.
July 16, 2009
Still working on binary restore... I wasn’t getting symbols into the symbol tables. Memes should work better now. I have to work on names, though. They’re just a kludge still. "My name is fred."... I think it used to modify speaker-1, but it doesn’t anymore. Should it?
Gar! More trouble. I am saving multiple copies of brainhat-1 and other stuff. They are the same concept, updated and cloned, so they live in different memory locations--which causes me to treat them as separate on the save. In a test a moment ago, there were 14 copies of brainhat-81b6 saved.
Ideally, the context would get updated in Brainhat so that this couldn’t happen. It would help conserve memory too. But it isn’t fatal for there to be clones floating about. Anyway, to fix the issue with the save in the short term, I could keep track of not whether I have seen the same memory location before, but whether I have seen the label. The first duplicate saved to the binary file would be the winner, which would be consistent with the last-into-the-context being most up-to-date. Alternatively, I could take the label and look up the symbol in the symbol table... and save that.
I do need to create a better ponder routine for consolidating the clones. One exists. Maybe it’s broked.
It’s all getting better, but this is giving me agita now:
define Root-9d60 [999959711] (90f2da0) {justcloned} label imp-quote-1 auxtag no-object-context object Root-9d3e tense imperative subject brainhat-81b6 verb totell-1 indobj speaker-1 define no-object-context [2622] (8a26ea8) {clean} label no-object-context define Root-9d3e [999959745] (90d00e0) {justcloned} label sent-action-10 subject brainhat-1 verb todesire-1 object Root-9d23 tense present number singular person first define brainhat-1 [2760] (90f2320) {justcloned} label I label Me label brainhat label brain hat label yourself label system label u label you label brainhat-1 orthogonal human-1 person first number singular child-of human-1
This is a snippet taken from "if You do ask how is the weather man then tell You that I want lightning hit the weather man." The copy of brainhat-1 in the inference template is clean, which is good. But it’s marked as just_cloned, which isn’t so good. Wondering why... considering that if I went with plan B, above, I’d get the correct RW flag from the symbol table....
It’s looking brighter. The extra dirty symbols in the inference template (and any subjunctive clause) may not be clean. Consider the statement "i know that the red princess is ugly." There is no red princess in the context--as there shouldn’t be. However, the CC created contains a dirty copy of "princess" with the attribute "red" attached. AUXTAG no-object-context keeps addtocontext from adding the components to the context symbol tables. I was ignoring it on restore. My bad.
They’re dirty concepts, though. They’re just not labeled differently from the clean copies. This is a problem in a binary save; I need to be able to identify these dirty copies and a) copy them and b) not add them to the context symbol tables upon restore. I have the aux tag on the restore, so I should be able to observe and obey that. In processing/on the save, I need to either re-label these concepts so I don’t intermix and confuse them, or find a different way to identify them. I could end up with two different princesses in inference templates, each bearing the same label. Using the concept memory address already failed.... Looking them up in the symbol table won’t help either, since their labels are identical to the virgin/clean copies. Must use brain.
I think I want to re-label them in processing in any case. I don’t want dirty concepts looking like clean ones. Already tripped me up once today.
July 21, 2009
Trying this:
• Going to go back to looking at concept memory locations (not label locations) to decide if I’ve seen a concept already or not during the save.
• Going to observe no-object-context on restore.
• Going to look into processing inside of ponder to see if I could do a better job of consolidating concepts in the context so that there are fewer involved in the restore.
(Later) Ignore the last few comments. I was looking at ways to have addtocontext restore the symbols for me, and I came across addtocontext_h commented out a few weeks ago. I didn’t remember what addtocontext_h was trying to do.... it appeared to be adding dirty symbols from the memes to the clean symbol table. I thought "That’s just wrong. I want to add symbols to the dirty symbol table."
Thinking this through... the stuff coming back from the stored memes will have its own copies of many concepts, such as ’brainhat,’ and its own context symbol table. Besides that, the restored copies may have slightly different qualities, e.g. "happy brainhat" in one meme and "sad brainhat" in another.
The idea of a third symbol table for all memes makes sense to me at this moment: 1) one for clean symbols, from the vocabulary, 2) one for dirty symbols, generated in computation, and 3) one for dirty symbols associated with the current meme. These symbols will have been dirty in the meme when it was stored--made available again when the meme gets focus. I must have had facilities for multiple symbol tables... Gotta go look.
Yep... Used to have meme-specific symbol tables. But they were used differently because of the presence of a database.
I am going to add meme-specific symbol tables and modify the symbol table search routines to look for these symbols.
(dinner time) I cleaned up nsymtab.c considerably. I deleted all the database variations of the original routines and added one routine called initmemesymtabs for meme-based versions of csyms and ctext.
To do:
• modify the meme structure to contain two new stab nodes
• call the new routine the first time a meme is brought in to init the stabs
• use the new stabs from addcctocontext
• add code so that find.c, debug and others look in the meme symbol table as well as the context and clean symbol tables.
• test
July 22, 2009
Above list is done. Still, there are too many symbols on restore. And the code behaves differently with CREDITSPKR turned on. This is because it inserts no-object-context to protect the quote’s symbols from being added to the symbol table. Really, the thing being said and the fact that the speaker said it should be shuffled out to context separately...
Anyway, I am grieving over the number of symbols I am getting back on a restore. And, as before, it appears to have to do with clones... Depending on no-object-context, as I suggested earlier, is a weak way of handling the problem, since it isn’t universally applied; it really has to do with spkrsays at this point.
The fact that I am getting a number of duplicate symbols back also suggests that there are just too many clones floating around, anyway....
I wonder if, during addtocontext, I could reference "just cloned" concepts by their labels and replace the symbol table reference to the latest clone. Then, in theory, I could go through the context (as in updatectxt), replacing the corresponding concept references with the new references--actually modifying the CCs in the context.
A little introspection: I always thought that memory pointers were the best way to string CCs together, for efficiency. Right now, construction by reference sounds pretty good!
(a little later...) I am happy to discover that the symbol table is up to date. I think I can clean up the clones on a binary save by looking to see if a concept is dirty, going back to the symbol table and finding what the label points to. Since we’ll be using the same entry repeatedly, the test for duplicates based upon addresses should work.
July 23, 2009
Traverse a concept, looking at all non-parent links {
    target = the thing the link points to
    if target is dirty
        if target is not Root
            if we find target in dirty symbol table
                if we have NOT recorded target's symbol in "seen" list already
                    record target in "seen" list (symbol and address)
                else
                    replace link with pointer destination from "seen" list
        if the target is Root
            if we have NOT recorded target's symbol in "seen" list already
                record target in "seen" list (symbol and address)
            else
                replace link with pointer destination from "seen" list
}
The above is an algorithm for consolidating the dirty copies of concepts before saving state. It would also be a good way to clean up the context, and ultimately garbage collect, so I am going to write it as a pre-step to state saving. It’ll be an mproc routine.
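In C, the pass might look like this (a simplified sketch: it leans on the dirty symbol table for the lookup, skips the "seen" list bookkeeping for Roots, and the accessor names are invented):

/* Sketch of the clone-consolidation pass: for every dirty,
 * non-Root link target, look its label up in the dirty symbol
 * table and repoint the link at the canonical copy. */
struct con;
struct link { struct con *target; struct link *next; int is_parent; };
extern int is_dirty(struct con *c);
extern int is_root(struct con *c);
extern const char *label_of(struct con *c);
extern struct con *dirty_symtab_lookup(const char *label);

static void consolidate(struct link *links)
{
    for (struct link *l = links; l; l = l->next) {
        if (l->is_parent || !is_dirty(l->target) || is_root(l->target))
            continue;
        struct con *canon = dirty_symtab_lookup(label_of(l->target));
        if (canon && canon != l->target)
            l->target = canon;         /* drop the stray clone */
    }
}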
I have to go for a bike ride. There’s rain on the way and the Halestar sale is in limbo because Phil wants everything and wants me to indemnify him for it. My old acquaintance Rick is as useful as Ed McMahon.
July 24, 2009
Today, Phil threatened to sue me. Then he implores that we talk. My lawyers are saying that I shouldn’t even call him back.
August 28, 2009
Then... Phil’s company went out of business. I tried to make a deal with one of my competitors. That got complicated. I got a job offer from a company that makes mobile games. It pays too little, but it looked like a strenuous mental exercise (which is good). But now I can’t put Halestar down because it might have a chance and I really like the freedom to go down and take the sails off my boat between every friggin’ tropical storm. Hi.
August 30, 2009
Tropical storm Danny was a dud.
August 31, 2009
My algorithm for stamping out clones, above, seems to have worked very well. Well enough, in fact, that I am afraid to tax it.
Now imagine lots of little brainhats. And imagine a set of applications that ask questions--how you feel about ice cream, politics, religion, sex... I capture it all. Each subject that interests you becomes the area of interest of a little copy of brainhat. It has the same opinion and draws the same conclusions. Someone says something. The gears turn. The little brainhats argue among themselves to decide what’s interesting--the most interesting thing to you. It blurts something that you are surprised to hear. It’s what you were thinking! Smile for me. I took your picture!
September 7, 2009
Trying to find time to get back on this. I’ve been sailing a lot. Working all the time. Needy women...
September 12, 2009
I’ve just consumed a text on statistical NLP to find out why I got diss’d for the conference in Poland this fall. It’s a very well-written book describing the tools for extracting patterns in large corpora and applying them to new data. There was also some treatment of grammar-based NLP (like I have here), and some statistical methods.
I see some value I can apply. But I don’t see statistical NLP (HMMs and the like), used for machine translation, POS tagging, text extraction, etc., as a replacement technology. I am not simply looking to translate language input into another form. Rather, I want to compute with knowledge, and I need to explain the computations precisely to Brainhat, in terms of unambiguous language. My reaction was the same as I had studying neural networks and curve-fitting: "what do you debug if you don’t like the results?" You’ve got a matrix full of coefficients.
I’m trying not to say "not invented here" and not be pedantic, but I am going to describe what came to me in the course of reading the book. Not all of it came from the book:
• I can see the value of a preferred vocabulary on a per-meme basis. This would reduce ambiguity in interpretation. We used to have extra vocabulary when Rich had the database in place. I’m not looking for that, exactly. Rather, I am thinking that I would like to raise some of the vocabulary in importance inside a meme. I suppose this could be tucked into the meme map.
• An old notion--the probabilistic CFG. One can use sampled data to help reduce ambiguity in word and phrase interpretation as a function of extraction from corpora.
• Use of parallel corpora for translation of complex English to simple English. Not sure how much work this will be. I downloaded a few tools. "I can see myself in a new car" -> "I want a new car." I could spend a year annotating translations.
• An old idea: use AIML or something like it for de-idiomization and language simplification.
• This idea is one that’s been tickling me for some time, but I will describe it in terms of the NLP text: using corpora to predict next utterances and, more interestingly, next actions. Wouldn’t it be cool to use the history of the past to predict the future? Perhaps detecting temporal movement in text would be the way to switch memes. Maybe, after simplifying the language, text could be compiled into processes strung together by a meme map. For example, the process of walking someone through debugging a problem or taking out a policy.
I need the following sequence to work:
>> ask me what i want
what do You want?
>> a ball
a ball.
>> what do i want
I do not know.
>>
I found a couple of AIML interpreters in shards. One that looks promising is RebeccaAIML, though it’s a little weighty and comes with a lot of dependencies. I have been compiling the Xerces XML library for the last 15 minutes. Rebecca also needed ’boost,’ a library of C++ headers for various stuff. And now the boost file system. I’m gonna turn blue... 15 minutes later.... 10 more minutes...
How did code get so big? I just got a Color Computer from a fellow on eBay. I have been running Unatron for the first time in 20 years. It’s gotta be 20K, tops.
Looking at OpenCyc. I wonder why it always seemed impenetrable in the past. I guess I never found the tutorial. Looks good!
My God: after all the compiling of boost and Xerces, compilation of Rebecca failed immediately. Shit. The problem appears to be too many definitions of ’exception.’ Fixing it friggin’ everywhere. Made it ’std::exception’.
If this thing works, it’s going to be on the other end of a TCP connection. I can’t imagine linking this into Brainhat. I’m hoping I get an AIML interpreter out of it, but it feels like I am joining a church.
Member function SetDoValidation doesn’t exist in SAX. Commenting out references in GraphBuilderAIML.cpp
Gar! Got through the main trunk. Something just shot out a spring in ’console.’ Gotta go eat and get pickled. Enjoying my day, though. Back later.
It runs... The path to the AIML is hardcoded in /usr/local/src/Rebecca.../build_files/autoconf/lib/Makefile.in. There are config files, but they are ignored. The setting is aimldir = ${datadir}/Rebecca/aiml/annotated_alice.
September 14, 2009
It works. I have been jamming idiomatic language into it. I am going to write the glue to connect it to Brainhat tonight and then move onto some other thing. I will add translations as they become relevant.
September 19, 2009
Playing with Moses. Statistical NLP may be to generist NLP what string theory is to...
September 23, 2009
It would be nice if AIML had some concept of taxonomy. It’s pretty dumb. I have been looking at other ways to create an English simplifier front-end, such as through statistical NLP. It’s pretty ugly. I might write a preprocessor using Brainhat routines instead. Perhaps, if it came up with multiple interpretations, it could offer them all. I’ve already started looking at Cyc as a way to help reduce ambiguity in input.
Other thoughts: at some point I am going to make the last open release for a time, and work on commercial products. Atlantic Computing? Halestar is dead. Internet security and wireless is like washing machine sales and repair. I’m on my way back from VB now. Clients can’t be bothered to call back about appointments. I had a nice trip down. Last trip down, perhaps. Need a job, maybe.
I was thinking about web site updates, too. It’s time to make it look cooler. I need to stop calling it NLP, and start calling it computing with knowledge.
September 25, 2009
>> hi hello. are You a man? >> yes You are a man.
This was supposed to cause a meme shift. Look into it. Also, let’s call them something better than "memes."
Line 213 in tests.c has a reference to selftalk commented out in favor of speakout. I think if selftalk were used, an inference would have fired. I am wondering if there was too much going into the context... Guess I could uncomment and see what happens....
That wasn’t it. speakout and selftalk are practically the same thing. The issue is that each calls match_first with its input, so there’s no searching to see if another meme would be a better fit. I could modify them to call inputcycle or add a ponder routine that looks into other memes after the fact. Gotta get ready for a party now, though.
September 29, 2009
Lofty ideas often have to yield to basics... I wrote a tool to turn the test results into an HTML page. It’s very handy, and is making me go fix all the patterns. Of course, I hit an iceberg immediately.
In sent-action-9 there is a trailing optional attribute so that the pattern can match stuff like "mario is seeing the princess in the water" in addition to "mario is seeing the princess."
$r‘csubobj-ana‘0! $c‘tobe‘1! $r‘actions‘2! $r‘csubobj‘3![ $r‘ccattr‘4]
The pattern works well by itself. In a more complicated input, where this pattern is a sub-pattern, the trailing optional attribute is a problem if the trailing attribute isn’t in the input. The reason appears to be that the " " is being consumed by the pattern before it fails, and not being returned to the input buffer. I need to go on a witch hunt. If I find and fix this, it may have a profound effect on a lot of other patterns.
I fixed it! It was in lookfor() in find.c. I haven’t made a change to lookfor() in 19 years! Better yet, none of the tests broke. And the one that was broken now works! Woohoo!
Different issue:
>> kevin’s number is 6 kevin’s number belonging to kevin is 6. >> do i say that kevin’s number is 6? yes. You say kevin’s number belonging to kevin is 6. >> do i say that kevin’s number is 4? maybe. I do not know.Segmentation fault
This doesn’t appear to be related to memory.
Later...
define cattr-time label cattr map PREP,OBJPREP rule $p‘at-1‘0![$c‘preposition‘0! ]$r‘cattr-time2‘1 define cattr-time2 map ROOT,ATTRIBUTE rule $r‘cattr-number‘1! $c‘time-1‘0
The attribute rules above caused the segv. What does $p do, anyway? I don’t recall. It can’t have many road miles...
October 3, 2009
Working on tests some more.
Better look into this:
>> what is kevin’s address/ kevin is male.Segmentation fault
It works without the typo.
I am back to looking at questions of the sort "what kind of pet do you have?" The "kind of" part, and its analogs, are asking more than "do you have pet?," which is what Brainhat is currently answering. I need an mproc that trolls the context for children of the indirect object in the discourse. The problem is that brainhat asks "do you have a pet?" and I answer yes. It doesn’t go on to ask "what kind of pet do you have?" because the internal answer given back to selftalk is "you have a pet." It should come back "I don’t know."
I also need to tag this simple-statement so that when it comes time to voice it, the output routines will say "what kind of..." in lieu of "do you have...?"
I’m on it.
It’s also time to implement "if you [don’t] know...." so that brainhat programming can say stuff like "if you know what kind of pet i have..."
It’s looking like the best way to implement this search for children of ’pet’ is to add some processing to qacands and qacandsall. I am sure I will be able to make use of it in other places in the future. The idea is that, based on the setting of a flag, qacands{all} will return a chain that explicitly doesn’t include the parent. So, if a match for child of ’pet’ is requested, the result will not contain ’pet,’ even if it would otherwise come up empty.
The name of the flag will be QACHILDONLY.
Wow... this is powerful stuff. Routines vrfy_achild and vrfy_xchild are serious pieces of work. They will tell when one CC is the child of another. But it will take some thought to make them say whether one CC is the same as another. That is, I’d have to screw up some elegant code. I need to do it, though. In addition to "what pet do you have?", I have to be able to answer questions like "what kind of pig in a poke do you have?"
The question is: should I put another routine in Irels.c that answers "this is the same as that?"
I took a little nap, and the idea came to me. It’s kinda obvious: if thing A is a child of thing B, and thing B is a child of thing A, then thing A is thing B. I’ll add that and my new flag to qacands{all}, and I should be good to go!
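In code, the test is tiny (vrfy_achild is the real routine; the wrapper and its exact signature are my sketch):

/* Sketch: two concepts denote the same thing if each verifies as
 * a child of the other. */
struct con;
extern int vrfy_achild(struct con *child, struct con *parent);

static int same_concept(struct con *a, struct con *b)
{
    return vrfy_achild(a, b) && vrfy_achild(b, a);
}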
October 4, 2009
Making changes in the qacands routines (it occurs to me again that the name meant "question and answer candidates"). The flag can be set right inside the pattern, within input-patterns--no mproc necessary.
Something to look into: in conrtns (line 520), trying to print dbg[tmppr->value] for an auxtag causes a segv.
Fixed. Now I have a problem with segv’s that appears to be memory related.
Gar! Memory problems. I am having trouble making sense of them. It all started when I added various forms of "to eat" to the vocabulary. I’ve gone through a divide and conquer exercise with memory management, but the problem moves around without determinism. I am going back to look at "to eat" again. I can’t forget that I was in the middle of implementing some new grammar for "what kind of pet do you have?"
Gar!
October 6, 2009
I’ve been searching for this thing for days. It was clear that 1) saving everything automatically fixed the problem and 2) that something wasn’t being saved otherwise, and memory was getting corrupted.
I tried some stuff in gdb, but it was heartbreak; it is not possible to watch a memory location that hasn’t been malloc’d yet. So, I decided to put another structure member into concepts, intlinks, etc.... one that marked whether the structure has been saved. I then expected to be able to dump CCs and see where the trouble might be.
What happened instead is that the segvs went away. Grrr.. For now.
October 7, 2009
I want to finish the grammar I was working on before the memory problems. When I get back, I want to investigate whether saving concepts partway through the process might not be at the root of the problem. Consider that if I am plugging leaks by saving concepts before the very end, new links added to the concepts after the save are not being explicitly saved.
Hmmm... I could test this theory by saving links from a concept even though the concept appears to have been saved already. Better yet, I could report that I found such an occurrence. Gonna do that.
Hmmm... Shows nothing.
No, wait. I am finding a few things... But I’ve had the boys here the last few days and I am so tired I don’t even know what I am debugging anymore. I have to look at the memory thing all over again, methinks.
October 8, 2009
Argh. I am going to do memory over again. No more splice/unsplice or spot saves.
1) During processing: string structs together as they are malloc’d. They don’t even have to be the same type.
2) At the very end, go through and mark the saved flag on everything we’re going to keep.
3) ’reap’ by running through all the strung together structs, ’free’ing with abandon.
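A sketch of the scheme (all names invented; the header-in-front-of-payload trick is one way to string arbitrary struct types together):

/* Sketch: thread every malloc'd struct onto one chain, mark the
 * keepers at the end of the cycle, then free the rest. */
#include <stdlib.h>

struct allochdr { struct allochdr *next; int saved; };
static struct allochdr *allocchain = NULL;

static void *bh_alloc(size_t size)
{
    struct allochdr *h = malloc(sizeof *h + size);
    h->next = allocchain;              /* string structs together */
    h->saved = 0;
    allocchain = h;
    return h + 1;                      /* caller's payload follows header */
}

static void reap(void)
{
    struct allochdr **pp = &allocchain;
    while (*pp) {
        struct allochdr *h = *pp;
        if (h->saved) {                /* marked at the very end: keep */
            pp = &h->next;
        } else {
            *pp = h->next;             /* unlink and free with abandon */
            free(h);
        }
    }
}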
But! Discipline. I need to finish the grammar I was working on before I wrestle this daemon.
October 18, 2009
On October 3rd, I solved a problem I was having with answering questions like "what kind of pet do you have?" It’s easiest to explain by example:
"you have a pet. what kind of pet do you have?"
Brainhat (qacands) would search and find "pet" as the answer to the question, ala:
"I have a pet."
This answer isn’t what I wanted. The answer I want is "I don’t know," because the intent is always for a more specific answer to the question. On the other hand,
"you have a dog. what kind
of pet do you have"
"i have a dog."
would be correct.
I solved the problem by adding a flag and some processing to qacands and qacandsall to return only proper children of the token passed to it. So, if qacands was handed "pet," and the context had "dog", "cat" and "pet" in it, only "dog" and "cat" would be returned. If "pet" was the only child in the context, qacands would return empty.
My fix didn’t pass the next few tests, though. First, the pattern (question-what-7) returns a CC with no OBJECT attached if qacands came up empty, as in the question "what kind of pet does luigi have?":
define Root-80e3 [999967004] (a185a80) clean label question-what-7 wants third wants singular wants present enable does-1 verb tohave-1 subject luigi-1
The output comes from utter_dunno because the resulting CC is just plain broken. Fortunately the answer "I don’t know," is the correct one.
The bigger issue is that there’s no way to recover the original question once qacands has thrown everything else overboard. I want to be able to say "do i ask what kind of pet do you have?," or "if i ask what kind of dog do you have then i want to eat your dog."
I think that, technically, the use of a hypothetical object, such as "pet", is an example of the subjunctive. There might be no pet. If I want to record that there was a question about nothing, then I have to tuck it away before qacands gets it and reduces it to a question about the world as known from the context. And I am wondering where else I need to be thinking about the same issues.
rule what [kind of ]![type of ]$r‘csubobj-prep-qchildonly‘2! $c‘enablers‘3! $r‘csubobj-q‘0! [$c‘not‘4! ]$r‘actions‘1![?] map SUBJECT,VERB,OBJECT,ENABLE,WANTS
Where we see sub-pattern csubobj-prep-qchildonly, I could instead invoke csubobj-prep with NOQA set. Then, quotques could run to capture the question. Unset NOQA. Then, invoke a new mproc that expands the object with csubobj-prep-qchildonly, creating a string of candidate CCs. The mproc will not return a CC with a missing OBJECT. It may return nothing at all. What happens then? Hmmm...
Gar! I need to spend more time with the code. If I change the object match to csubobj-prep, qacands doesn’t run. And even then, the question still works because I have hashing in place. In fact, I MIGHT NOT EVEN NEED QACANDS anymore.
Better yet, with csubobj-prep in the rule instead of csubobj-prep-qchildonly (or csubobj-prep-q), I get "pet" back in the object spot. That should mean that the question is recorded correctly. It is!
So, I need a way to keep declques from using the exact match ("pet"). That code is in qacands and qacandsall, and I need to rip it back out and relocate it into declques2.
Later.... done. Works. I left the code in qacands.
October 24, 2009
There’s so much to do. I could go back and fix memory, but I want to look at something I looked at once in the past. Brainhat says "what do you like?" The answer can be "i like a honda," but "a honda" doesn’t work. The input "a honda" gets caught by tell-all-csubobj, which spits "a honda" back out....
To make it so that "a honda" can be taken as the answer to a pending question, I’d need to first check to see if there is a saved declarative question willing to take the proposed answer (a honda) as the child of the object in the question. This would also be the place to pass the proposed answer over to Cyc to see if it makes sense.
If there isn’t a pending declarative sentence, or the answer doesn’t make sense, then the behavior could fall back to tell-all-csubobj.
Before I do anything, I need to look through the bones to see what I did in the past.
Later... didn’t find anything. Found commitadj, which is the routine that adds "red" to "ball" after "ask me what color is the ball", "red". At least it will suggest a process...
November 3, 2009
Hi.
So, step 0: see that the question is saved.
Step 1: if there is one, make the "thing" the answer to the question by self-talking it, and record that the speaker said that "thing" was the answer to the question. I can go add Cyc later. What if Verbatim is off? Then I only want to record the fact that the speaker said it.
Step 2: whether step 1 fails or not, fall through to the code that is currently tell-all-csubobj.
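The flow, sketched (selftalk and tell-all-csubobj are real; the helper names and signatures are invented for illustration):

/* Sketch of the pending-answer flow: if a saved declarative
 * question will take the new input as a child of its object,
 * self-talk the substituted answer; otherwise fall back. */
struct con;
extern struct con *saved_question(void);                /* step 0 */
extern int vrfy_achild(struct con *child, struct con *parent);
extern struct con *object_of(struct con *cc);
extern void selftalk_answer(struct con *q, struct con *answer);
extern void tell_all_csubobj(struct con *thing);

static void handle_bare_thing(struct con *thing)
{
    struct con *q = saved_question();
    if (q && vrfy_achild(thing, object_of(q))) {
        selftalk_answer(q, thing);     /* step 1: "i like a honda" */
        return;
    }
    tell_all_csubobj(thing);           /* step 2: the old behavior */
}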
November 5, 2009
I’m going to have to deal with questions about lineage too. Like "what are you?".
Anyway, the question is saved in checkcommand when one says something like "ask me what color is the ball" What about something like "ask me what do i like"? Yes... saved in the same spot.
Now, I am going to put a pattern in front of tell-all-csubobj. It will catch the ’thing’ and see if it can be substituted into the place of the OBJECT.
January 16, 2010
As has happened before, other events interrupted development, including a returning work load (that’s good!), some effort with WiFi public address systems and an ongoing distraction with video over IP networks.
And, as also happens, other Brainhat ideas overtake ongoing efforts. In this case, I have been working on generally improving Brainhat, so the effort will be going on ad infinitum. Right now, I’m thinking about a couple of other interesting efforts, and also thinking about the web pages.
First, the web pages: the main page is a rant. I need to replace it with a more general thing and retire the rant to a second-level page. And, as ever, I need a fresh windows version up there and some online demonstration.
Here’s whacky idea number one: I’ve been playing with video compression. Rick’s been at it too. The popular compression algorithms at this time include mpeg4 and mpeg4 part 10 (h.264). These are elaborate algorithms to take the existing scene and turn it into an easily transmitted watercolor, where the whole picture is captured once in a while, followed by spot updates captured as spatio-temporal changes. It’s really the lowest possible level for capturing and reconstructing an image--pixel and motion approximations. But, it is very lossy--the reconstructed image may be nothing like the original, depending on the resources (not) committed.
A yet to be correlated separate thought: I once had a discussion with a couple of people at Blue Sky studio, suggesting that it would be an interesting idea to be the owners/renters of a set for others’ use in rendered movies. Imagine, for instance, that you might have a model of New York that others could borrow as a backdrop for their digitally-rendered movie.
The third sub-thought: imagine that you ask someone what their recollection is of a movie scene shot in a city. They’ll recall the actions of the agents (the actors) and envision the backdrop. The backdrop is almost irrelevant unless the agents interact with it, and in many cases, they are interacting with a portion only (say, a taxi cab). That is, a person’s recollection of a movie scene will be guided by what appears on the screen, but it won’t be semantically richer than what they would have conjured in their heads if they’d intently read a book on the same theme.
Take the last three paragraphs together and consider that there’s another kind of scene compression that’s been with us since the invention of language: it’s semantic compression, in which the relevant details of an event are relayed and reconstructed in the mind of the listener. What difference do the backgrounds make as long as they are relevant? And then, think about the possibility that if scenes were reconstructed semantically, they could also be tailored. The backdrop could be any city in the world.
Whether it’s a meritorious notion or not, it would make an interesting demo.
The next batch of thoughts is a little more un-ridiculous. I’ve been wondering about ways to make Brainhat’s meandering through a conversation, and its expectations about the way a conversation will proceed, more closely match peoples’ soft expectations. I can program memes and make the program goal-driven at a scripted level, but how do I go about making it behave and think more like a person? At the same time, how do I make it grammarless and grammarful at the same time?
I (believe I) need the grammar so that I can parse stuff into structures that I can manipulate. At the same time, I need to be able to handle misspellings, missing words and agrammatical constructions. The facilities I have now are:
• pre-processing to reorganize input into parsable forms
• overloaded vocabulary to include misspellings
• sloppy input pattern templates
What if I added a few more bells and whistles? Take some input with problems, e.g.: "I wnat yo"
There are two misspellings and possibly a different latent meaning altogether. The misspelled words could get kicked out and be replaced with N possibilities:
"I {gnat, want, what...} {yup, you...}"
These permutations could be compared against a large corpus to discover the most likely combination. Somehow, I also need to wend the state of the context into the probability model.
Furthermore, the same method should be able to handle positional alternatives. Consider "I you want", or "I you wnat". That leads to a list containing stuff like:
I you want
you want I
I want you
gnat you I
yup gnat I
etc...
A large corpus statistical model could help sort this out. Once the best alternative is found--then apply it to grammar. Just some thoughts...
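For instance, a toy version of the scoring (the bigram model is a stand-in for whatever large-corpus statistics I would actually mine):

/* Toy sketch: score candidate word sequences against a bigram
 * model mined from a corpus; the highest scorer goes to grammar. */
extern double bigram_logprob(const char *w1, const char *w2);

static double seq_score(const char **words, int n)
{
    double s = 0.0;
    for (int i = 0; i + 1 < n; i++)
        s += bigram_logprob(words[i], words[i + 1]);
    return s;
}

static int best_candidate(const char **cands[], const int *lens, int ncands)
{
    int best = 0;
    double bestscore = seq_score(cands[0], lens[0]);
    for (int i = 1; i < ncands; i++) {
        double sc = seq_score(cands[i], lens[i]);
        if (sc > bestscore) { bestscore = sc; best = i; }
    }
    return best;                       /* "I want you" beats "gnat you I" */
}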
January 17, 2010
I am looking at several Windows projects and I need to organize my thoughts.
• The client is out of date--uses SAPI4. Needs to be revised to SAPI5, perhaps.
• Nothing ever really happened with Pandora, so that could go.
• I want an interface to the windows version of Brainhat.
• I need to do something for a front-end to my WiFi PA project.
The third item is maybe the most important. The fourth is a separate project, however I could use SAPI5 in it to make an aber kuhl produkt.
I’d like to make use of SAPI5 (probably time for SAPI6 now!). I could migrate the SAPI4 elements, but I understand that the two APIs are very different. Or! I could strip the SAPI4 out of the client and make it text-only for the moment. It probably should start up, spawn a copy of Brainhat and link to it through a pipe, or whatever it’s called in the Windows world.
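From what I have read, the plumbing would look something like this (a sketch of the standard anonymous-pipe recipe; the brainhat.exe path is assumed and error cleanup is elided):

/* Sketch: spawn brainhat.exe and talk to its stdin/stdout through
 * anonymous pipes. */
#include <windows.h>

static BOOL spawn_brainhat(HANDLE *to_child, HANDLE *from_child)
{
    SECURITY_ATTRIBUTES sa = { sizeof sa, NULL, TRUE }; /* inheritable */
    HANDLE in_rd, in_wr, out_rd, out_wr;
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;
    char cmd[] = "brainhat.exe";       /* path assumed */

    if (!CreatePipe(&in_rd, &in_wr, &sa, 0)) return FALSE;
    if (!CreatePipe(&out_rd, &out_wr, &sa, 0)) return FALSE;
    SetHandleInformation(in_wr, HANDLE_FLAG_INHERIT, 0);   /* our ends stay private */
    SetHandleInformation(out_rd, HANDLE_FLAG_INHERIT, 0);

    ZeroMemory(&si, sizeof si);
    si.cb = sizeof si;
    si.dwFlags = STARTF_USESTDHANDLES;
    si.hStdInput = in_rd;
    si.hStdOutput = out_wr;
    si.hStdError = out_wr;

    if (!CreateProcessA(NULL, cmd, NULL, NULL, TRUE, 0, NULL, NULL, &si, &pi))
        return FALSE;
    CloseHandle(in_rd);                /* the child owns these now */
    CloseHandle(out_wr);
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    *to_child = in_wr;                 /* WriteFile() input to Brainhat here */
    *from_child = out_rd;              /* ReadFile() Brainhat's output here */
    return TRUE;
}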
So, looking at my own notes, I see three projects: 1) strip out the SAPI4 components from the Brainhat client. 2) work on SAPI5 components for my PA system. 3) once I understand SAPI5, re-add speech components to the Brainhat client.
I’m on my way to Montreal to learn about image compression products from Haivision this week. It’s probably going to get into the middle of whatever I start. How could it be otherwise?
Okay... I begin by rearranging the web site a bit. Then I get the Windows version of Brainhat back up and running. Then I strip the client in a way that will allow me to add back the new components. Anything to do with hints, N-best and the like will remain in the client, though not useful for the moment.
(later) I rearranged the web site a bit. I got the Windows version up and running. I want to get the daemon code working too, though. Good progress for a couple of hours’ work.
February 5, 2010
I’m having a little problem with something, and I haven’t put my finger on it yet. I am getting this:
>> hi
hello. are You a man?
>> yes
You are a man. this is totally huge... You are a man. You might have a girlfriend. she. do You have a girlfriend?
>>
See the lone ’she’? That’s a turd. It doesn’t always happen. That is to say, it happens deterministically, but not with every type of input. If I start with ’I am a man’, I don’t get the problem, for instance.
I traced the utterance of ’she’ to this rule:
define ccsent-top
        label   ccsent
        rule    $r‘csent‘0[ $r‘csent‘1]
        map     ROOT
The purpose of this rule is to allow one sentence to follow another immediately, as in "No. I like ham." If I hack off the optional second sentence, the output looks okay. I am wondering if punctuation is the issue. I am wondering what is left in the input buffer to cause this problem, and why. Could it have to do with caching and meme-shifting?
Here’s the output from ptrace to support that notion:
find: rule csubobj4 matched: a girlfriend.
find: rule tell-all-csubobj matched: a girlfriend.
find: rule sent-nomod matched: a girlfriend.
find: rule sent-action-10 matched: You might have a girlfriend.
find: rule sent-partial matched: You might have a girlfriend.
find: rule csent-common matched: You might have a girlfriend.
You might have a girlfriend.
find: rule tell-all-csubobj matched: a girlfriend.
find: rule sent-partial matched: a girlfriend.
find: rule sent-partial matched: a girlfriend.
find: rule subobj1 matched: a girlfriend.
find: rule sent-partial matched: a girlfriend.
find: rule csent-common matched: a girlfriend.
she.
find: rule ccsent-top matched: You might have a girlfriend.
Or! perhaps the buffer isn’t being effectively cauterized following the first match... that is, the last bit to match ("a girlfriend") is still hanging around? I bet that’s it.
February 6, 2010
This fixed it (the optional punctuation marks are now consumed before the second sentence can match):
rule $r‘csent‘0[.][?][!][ $r‘csent‘1]
Trying to fix a segv that comes with ’what is that?’ There’s some monkey business with caching level 2 in find.c. It almost looks like I am not using the input cache on the first pass? I’m not. MEMETEST has to be set. Is caching at odds with using imperatives? It’s possible that MEMETEST should be separate from another switch, say CACHERULES? Gar.
I fixed it? There was too much processing going on in nmeme.c. I changed something, but I didn’t mean to fix everything.... Oh... false alarm... I still have cache level 2 turned off... It’s better. But when I turn on level 2 caching, it is still screwing up meme-shifting. I wonder if I care... Gar. I do.
February 7, 2010
I’m thinking that there needs to be some cleanup of the level 2 caching. Or maybe I should just leave it off.
Some of the rules probably shouldn’t be cached. I haven’t found a good example of this, but it occurs to me that there are mprocs that probably want to be exercised (such as the currently-named checkcommand, soon to be imperative), but won’t be when the cache is active. Or worse, a rule could set a flag that would never again be reset. If I find support for these suspicions, I could add a "no-cache" directive to rules. Or, I could add the preproc and postproc command chains to the cached data and exercise them on a cache hit. That might work....
The first go-round of a rule--before meme testing--isn’t cached at all. I don’t know why this is. Either the caching works or it doesn’t. Why wait until meme-testing to turn it on? The fix would be a CACHE/NOCACHE pair of some sort, independent of MEMETEST.
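Roughly what I have in mind, as a sketch. None of these names exist yet; this is not what find.c does today:

/* Sketch: a cached-rule entry that carries a per-rule no-cache
 * flag plus the mproc chains, so a cache hit can still exercise
 * them (and flag-setting procs still fire).  Hypothetical names. */
#define RC_NOCACHE 0x01              /* rule asked not to be cached */

struct mprocnode;                    /* chain of mprocs to (re)run  */
struct conchain;                     /* a cached parse result       */

struct rulecache {
    int    ruleid;                   /* which rule produced this    */
    int    flags;                    /* RC_NOCACHE, etc.            */
    struct conchain  *match;         /* the cached match            */
    struct mprocnode *preproc;       /* re-run these on a cache hit */
    struct mprocnode *postproc;      /* ... likewise                */
};

/* On lookup: if the entry carries RC_NOCACHE, fall through to a
 * full match; otherwise replay preproc/postproc against the
 * cached conchain before returning it. */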
The caching code checks meme IDs. I don’t know why. It also checks a ’serviceid’, which changes at the start of each rule exercise. That is, every trip through match_first wipes out the previous cached data. Who cares whether the meme ID is different? The old data are tossed anyway.
The memory for the cached rules isn’t being cleaned up.
(later) Gar! I’m just shutting cache level 2 off. Level 1 does the heavy lifting anyway. Maybe I’ll look at it never. Would never be okay?
I cleaned up the memory management.
Small things to do:
• change references to "future-imperfect" to "hypothetical-subjunctive"
• clean up the mess with speech buffers
• get all the globals in one place and give them thread IDs
• rename "checkcommand" to "imperative" and clean it all up.
I was also thinking about every session being saved state. State saving seems to work pretty well, though I still need to provide for dynamic concepts (like names) and save a few other bits, like saved questions and the current meme setting. If I had that all wrapped up, then I could save the value of sbrk(0) after I’ve loaded up the vocabulary. Then each input cycle could be a brk (back to the initialized state), load state, run, save state and start again. A sketch follows.
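Something like this, assuming hypothetical entry points for loading the vocabulary and saving/restoring state. It only works if every per-cycle allocation comes from the sbrk-managed heap and nothing points into the discarded region afterward:

/* Sketch of the brk-based reset loop.  load_vocabulary(),
 * load_state(), run_one_input() and save_state() are stand-ins
 * for the real entry points. */
#include <unistd.h>

void load_vocabulary(void);
void load_state(void);
void run_one_input(void);
void save_state(void);

int main(void)
{
    void *mark;

    load_vocabulary();       /* everything below here is permanent */
    mark = sbrk(0);          /* remember the post-vocabulary break */

    for (;;) {
        load_state();        /* restore the saved session          */
        run_one_input();     /* parse, evaluate, respond           */
        save_state();        /* write the session back out         */
        brk(mark);           /* snap the heap back to 'mark'       */
    }
}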
• look into: dowd@netbook:~/brainhat/brainhat$ ./brainhat +repeat +memetest
  Segmentation fault
• look into: if a meme lacks a trailing line feed, the last statement gets left out.
February 8, 2010
>> what is a dog
a dog is dogs.
>> is a dog a pet?
maybe. a dog is a pet.
>>
Look into this. Also, avoid the plural answer "a dog is dogs." Answer "a dog is a pet" instead.
>> what is a dog?
find: rule subobj1 matched: a dog?
find: rule subobj-q-1-na matched: a dog?
find: rule subobj-q-1 matched: a dog?
find: rule subobj-q-1 matched: a dog?
find: rule csubobj-q-4 matched: a dog?
find: rule csubobj-q-4 matched: a dog?
find: rule subobj-q-1 matched: a dog?
find: rule subobj-q-1 matched: a dog?
find: rule csubobj-q-2 matched: a dog?
find: rule question-what-1 matched: what is a dog?
find: rule csent-questions matched: what is a dog?
a dog is dogs.
find: rule ccsent-top matched: what is a dog?
>> is a dog a pet?
find: rule subobj1 matched: a dog a pet?
find: rule subobj-q-1 matched: a dog a pet?
find: rule subobj-q-1 matched: a dog a pet?
find: rule csubobj-q-2 matched: a dog a pet?
find: rule subobj-q-1 matched: a dog a pet?
find: rule subobj-q-1 matched: a dog a pet?
find: rule csubobj-q-4 matched: a dog a pet?
find: rule csubobj-q-4 matched: a dog a pet?
find: rule subobj-q-1 matched: a pet?
find: rule subobj-q-1 matched: a pet?
find: rule csubobj-q-4 matched: a pet?
find: rule question-title-2 matched: is a dog a pet?
find: rule csent-questions matched: is a dog a pet?
maybe. a dog is a pet.
find: rule ccsent-top matched: is a dog a pet?
question-title-2 is calling pullwhats. pullwhats looks like it deals with immediate parents--more appropriate for asking "what is a man?" "Is a dog a pet?" would be better answered with a simple vrfy_achild (?).
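If vrfy_achild does what its name suggests, it is something like this recursive walk up the child-of links. The structures and the helper name here are invented stand-ins, not the real code:

/* Sketch: answer "is a dog a pet?" by walking child-of links
 * upward, rather than rebuilding parent copies with pullwhats. */
struct conchainnode;
struct concept {
    const char *label;
    struct conchainnode *parents;   /* child-of links */
};
struct conchainnode {
    struct concept *con;
    struct conchainnode *next;
};

/* return 1 if 'parent' appears anywhere above 'c' */
static int vrfy_achild_sketch(struct concept *c, struct concept *parent)
{
    struct conchainnode *pn;

    if (c == parent)
        return 1;
    for (pn = c->parents; pn; pn = pn->next)
        if (vrfy_achild_sketch(pn->con, parent))
            return 1;
    return 0;
}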
>> is a dog a pet?
entering chooseone (mproc)
Break in chooseone at the start: A conchain is available.
debug> dump
define Root-88d0 [999964975] (95f90e0)
  clean*
  label question-title-2
  requires pet-1*
  subject dog-8601*
  verb tobe-2*
define Root-88cb [999964980] (95f6ad8)
  clean*
  label question-title-2
  requires pet-1*
  subject dog-1*
  verb tobe-2*
The problem here is that chooseone is going to pick the dirty copy. The dirty copy has a dirty dog. The dirty dog is missing all parents except ’dogs’.
See here:
define dog-1 [4293] (95f6b38)
  justcloned*
  label dog
  label dog-1
  article a-8602*
  number singular*
  child-of pet-1*
  child-of dogs-1*
define dog-8601 [4293] (95f95a0)
  dirty*
  label dog
  label dog-1
  child-of dogs-1*
  number singular*
  article a-8602*
  requires things-87e9*
February 9, 2010
Here’s what pullwhats currently does:
It creates a result (a conchainnode) with a number of copies of each passed-in concept, each with a single parent. pullwhats is being called after chooseone, which is reducing the number of possibilities; chooseone is choosing the dirty copy of the thing in question. The dirty copy has already made a pass through pullwhats in a previous question, so it has only a single parent left.
I’d like a modified pullwhats that is capable of:
taking a whole conchain of possibilities. I will assume that they are all the same concept (?). That is, if the first element is dog-1, then the second element will be another instantiation of dog-1. (If not, there’ll be some interesting side effects.) In any case, whatever the first concept on the chain is, that’s the one we are going to work with.
Create a naked copy of the concept (dog-1), without parents.
Then make a whole bunch of copies of the concept (dog-1), each with one parent derived from all the copies of dog-1 on the passed-in conchain, subject to the following:
1) there are no duplicates in the same tense,
2) the copies whose parents match the number (singular/plural) of the concept are placed at the head of the resulting conchain. This way, when someone asks "what are dogs?" they are more likely to get "dogs are pets." Likewise, "what is a dog?" is more likely to fetch "a dog is a pet."
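A minimal sketch of that rework, using invented stand-ins for the concept and conchain structures (not the real definitions in brainhat):

/* Sketch of the reworked pullwhats: one naked copy of the base
 * concept, then one single-parent copy per distinct parent found
 * anywhere on the incoming conchain, number-matching parents
 * first.  Structures and helpers are hypothetical. */
#include <stdlib.h>

struct conchainnode;
struct concept {
    const char *label;
    int plural;                      /* 0 = singular, 1 = plural */
    struct conchainnode *parents;    /* child-of links           */
};
struct conchainnode {
    struct concept *con;
    struct conchainnode *next;
};

/* copy a concept, giving it exactly one parent (or none) */
static struct concept *copywith(struct concept *c, struct concept *parent)
{
    struct concept *n = malloc(sizeof(*n));
    n->label = c->label;
    n->plural = c->plural;
    n->parents = NULL;
    if (parent) {
        n->parents = malloc(sizeof(*n->parents));
        n->parents->con = parent;
        n->parents->next = NULL;
    }
    return n;
}

static void append(struct conchainnode **head, struct concept *c)
{
    struct conchainnode **p = head;
    while (*p)
        p = &(*p)->next;
    *p = malloc(sizeof(**p));
    (*p)->con = c;
    (*p)->next = NULL;
}

/* is this parent already represented on the result chain? */
static int present(struct conchainnode *head, struct concept *parent)
{
    for (; head; head = head->next)
        if (head->con->parents && head->con->parents->con == parent)
            return 1;
    return 0;
}

struct conchainnode *pullwhats2(struct conchainnode *in)
{
    struct concept *base = in->con;  /* assume every element is the
                                        same underlying concept    */
    struct conchainnode *out = NULL, *n, *pn;
    int pass;

    append(&out, copywith(base, NULL));          /* naked copy */

    /* two passes: number-matching parents first, the rest after */
    for (pass = 0; pass < 2; pass++)
        for (n = in; n; n = n->next)
            for (pn = n->con->parents; pn; pn = pn->next) {
                int match = (pn->con->plural == base->plural);
                if ((pass == 0) != match)
                    continue;
                if (!present(out, pn->con))      /* no duplicates */
                    append(&out, copywith(base, pn->con));
            }
    return out;
}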