Develop AI systems to efficiently learn new skills and solve open-ended problems, rather than depend exclusively on AI systems trained with extensive datasets
I think I have enough of a grasp now to start hashing out the fundamental methods that make up the reasoning process.
WIP
sight()
The initial vision check of the puzzle. Look at each input and output pair, look across all the train examples, apply pattern recognition at whatever granularity is useful, and use this information to come up with some "classification" we can use to narrow the search space. Does that mean a narrower set of operations to use during simulations? I think so.
It makes no sense to try translation operations on a puzzle when there's only recoloring to do. But again, we shouldn't try to actually classify the puzzle as one of these; instead we want a fuzzy, learned representation of it (an embedding). As we train this on all the publicly available training puzzles, we learn which operations are best used when, which should lead to faster and faster solve times.
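The operation-narrowing idea above can be sketched roughly as follows. This is a minimal stand-in, assuming a learned puzzle embedding and per-operation embeddings compared by cosine similarity; the operation names and random vectors here are hypothetical placeholders, since in practice both embeddings would be trained on the public puzzles.

```python
import numpy as np

# Hypothetical candidate operations; the real subroutine library would be larger.
OPERATIONS = ["recolor", "translate", "rotate", "mirror", "crop"]

# Stand-in embeddings. In practice these would be learned, not random.
rng = np.random.default_rng(0)
op_embeddings = {op: rng.normal(size=16) for op in OPERATIONS}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_operations(puzzle_embedding):
    """Sort operations from most to least promising for this puzzle."""
    return sorted(OPERATIONS,
                  key=lambda op: cosine(puzzle_embedding, op_embeddings[op]),
                  reverse=True)

# A puzzle whose embedding sits near "recolor" should rank recoloring first,
# so the search never wastes time on translations.
ranking = rank_operations(op_embeddings["recolor"])
```

The payoff is that the search loop only ever tries the top few operations in `ranking` instead of the whole library.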
search()
There's some kind of fractal search process we want here. It should be directed and informed by the expected output we have available: we apply subroutines repeatedly, trying to get nearer and nearer to that output.
How is it that this isn't a blind search? We don't exactly have a gradient giving us the direction of modification (operation) to apply, and non-differentiable search algorithms like genetic algorithms don't seem like the answer either.
We use the output information heavily when deciding the next operation to apply.
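One way to make that concrete is a greedy, output-directed loop: at each step, try every candidate operation and keep the one whose result best matches the expected output. This is a toy sketch, not the real search; the two-operation library and the cell-overlap similarity score are assumptions for illustration.

```python
def rotate90(grid):
    """Rotate a rectangular grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def flip_h(grid):
    """Mirror a grid horizontally."""
    return [row[::-1] for row in grid]

# Illustrative operation set; a real solver would have many more subroutines.
OPS = {"rotate90": rotate90, "flip_h": flip_h, "identity": lambda g: g}

def similarity(a, b):
    """Fraction of cells already matching the target (0.0 if shapes differ)."""
    cells_a = [c for row in a for c in row]
    cells_b = [c for row in b for c in row]
    if len(cells_a) != len(cells_b):
        return 0.0
    return sum(x == y for x, y in zip(cells_a, cells_b)) / len(cells_a)

def greedy_search(grid, target, max_steps=4):
    """Chain operations greedily until the target is reached or we give up."""
    trace = []
    for _ in range(max_steps):
        if grid == target:
            break
        # The output is consulted at every step: pick the op that gets closest.
        name, op = max(OPS.items(),
                       key=lambda kv: similarity(kv[1](grid), target))
        grid = op(grid)
        trace.append(name)
    return grid, trace

# Reaching a 180-degree rotation by chaining two 90-degree rotations.
result, trace = greedy_search([[1, 2], [3, 4]], [[4, 3], [2, 1]])
```

Pure greed can stall in local optima, which is presumably where the "fractal" part comes in, but it shows how the expected output steers each step without any gradient.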
Here's a common pattern we're likely to see. Let's say you have two subroutines:
Structure recognition
This subroutine (does it have params?) finds structures that belong together. I think we don't need external params because we have the output, but we will likely learn internal params each time it's run. Each instance of structure recognition that runs will carry its own learned internal state.
Structure rotation 90 degrees CW
This is just two operations, but the interesting thing happens when we chain them together.
First, we identify the major structures of the input and output. We can tell something is a structure when one or more of its properties (num blocks, color, shape, etc.) is maintained throughout the input and output. It may be more complex than that though.
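One concrete version of that property-preservation test: find connected same-color components in each grid, and call a component a "structure" when a component with the same signature also appears on the other side. Using (color, cell count) as the signature is an assumption; the post itself notes the real criterion may be more complex.

```python
def components(grid):
    """4-connected components of non-zero cells, as (color, cells) pairs."""
    h, w = len(grid), len(grid[0])
    seen, comps = set(), []
    for r in range(h):
        for c in range(w):
            if grid[r][c] == 0 or (r, c) in seen:
                continue
            color, stack, cells = grid[r][c], [(r, c)], []
            seen.add((r, c))
            while stack:  # flood fill over cells of the same color
                y, x = stack.pop()
                cells.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and (ny, nx) not in seen
                            and grid[ny][nx] == color):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            comps.append((color, frozenset(cells)))
    return comps

def shared_structures(inp, out):
    """Signatures (color, size) preserved between input and output."""
    sig = lambda comps: {(color, len(cells)) for color, cells in comps}
    return sig(components(inp)) & sig(components(out))

# Both grids contain a 2-cell color-1 shape and a 2-cell color-2 shape,
# just in different places, so both signatures survive the transformation.
preserved = shared_structures(
    [[1, 1, 0], [0, 0, 2], [0, 0, 2]],
    [[0, 1, 1], [2, 0, 0], [2, 0, 0]],
)
```

Anything whose signature survives the input-to-output transformation is a candidate structure for the later subroutines to operate on.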
Then, we apply the rotation subroutine to the structures themselves, not the whole puzzle. We could keep chaining rotation commands onto the prior one to get variations of the 90-degree rotation, like if we needed to go 270 degrees. We don't need to have 270-degree rotation as a prewritten subroutine because we can build it just fine.
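The chaining idea in miniature: with only a 90-degree clockwise rotation as a primitive, the 180- and 270-degree variants fall out of composition, so they never need to exist as prewritten subroutines. A minimal sketch:

```python
def rotate90_cw(grid):
    """Rotate a rectangular grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def compose(*ops):
    """Chain subroutines left to right into a single new subroutine."""
    def chained(grid):
        for op in ops:
            grid = op(grid)
        return grid
    return chained

# Derived subroutines: never written by hand, only composed.
rotate180 = compose(rotate90_cw, rotate90_cw)
rotate270_cw = compose(rotate90_cw, rotate90_cw, rotate90_cw)
```

Because `compose` returns an ordinary function, a derived chain like `rotate270_cw` is indistinguishable from a primitive to the rest of the search, which is what keeps the prewritten library small.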
There's obviously a trade-off between expressivity and search complexity. Larger subroutines can get you further without needing to chain (and therefore search longer), but if an intermediate step is needed, you still have to have subroutines fine-grained enough to reach it.
Maybe there are subroutines that sort of take you backwards in the process, like one subroutine that translates a structure to the edge of the universe, but then another that only takes one step.