Automated recap of the latest activity in #kaggle-arc-agi, created by @hermes.
Calling all developers, philosophers, and creatives!
Let me share a little background about what we're working on over here at #kaggle-arc-agi. If you're interested in working on it with us, DM me and I'll help you get up to speed!
Throwing down some ideas @akashvshroff and I had this morning.
One of the ideas that stuck out was some concept of an evolving search space.
There's some kind of initial filtering we do to narrow down what kind of problem we're working on. A quick scan of the inputs and outputs lets us know if it's a recoloring, translation, pattern extraction, or something else we might not have seen before.
I think I have enough of a grasp that I can start to hash out the fundamental methods that are part of the reasoning process.
WIP
sight()
The initial vision check of the puzzle. Look at each input-output pair, look at all the train examples and any other granularities of pattern recognition, and use this information to come up with some "classification" we can use to narrow the search space. Does that mean a narrower set of operations we use during simulations? I think so.
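To make the idea concrete, here's a minimal sketch of what sight() could look like. Everything here is an assumption: the category names, the use of numpy grids, and the simple shape/mask heuristics are illustrative, not a final design.

```python
import numpy as np

def sight(train_pairs):
    """Rough first-pass classification of a puzzle (hypothetical sketch).

    train_pairs: list of (input_grid, output_grid) numpy arrays.
    Returns a coarse label used to narrow the operation search space.
    """
    labels = set()
    for inp, out in train_pairs:
        if inp.shape == out.shape:
            if np.array_equal(inp != 0, out != 0):
                labels.add("recoloring")   # only colors changed, not positions
            else:
                labels.add("in-place-edit")
        elif out.size < inp.size:
            labels.add("extraction")       # output looks like a cropped pattern
        else:
            labels.add("expansion")        # output grows: tiling, reflection, etc.
    # All train pairs should agree before we trust the classification.
    return labels.pop() if len(labels) == 1 else "unknown"
```

The single-label check at the end matters: if the train examples disagree, we fall back to the full search space rather than committing to a wrong narrowing.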
Collecting some thoughts @rishi and I had as we discussed the challenge and how we might go about coming up with a solution.
No doubts about this being a hard challenge. But naiveté may be our greatest strength.
A few things that stick out from the conversation:
The intent of the Kaggle-ARC-AGI team is to develop AI systems that can efficiently learn new skills and solve open-ended problems without relying exclusively on extensive datasets. Below is a thematic and chronological synthesis of the team's recent insights and challenges, along with some potential connections and missing links that were observed.
WIP
I like the idea that we're building a "mind" that writes its own subroutines based on how it analyzes the puzzle at hand.
I don't think I want to try an LLM to generate these though, even though that may be something it could do. Too much variability in the code it would write, too slow, etc.
Reasoning doesn't guarantee an optimal solution (e.g. the fewest transformations to the output), but it should still generate a solution nonetheless.
As I think a little about some of the transformations we'd make on the holographically encoded input, I know that the set of operations I come up with will not be exhaustive.
In reasoning, it seems to me that there are multiple resolutions or granularities we can take as we work to solve the problem.
This would be the first step, where we learn what kind of task we need to reason about.
Before even getting into tackling the reasoning aspect of ARC, I'm thinking about how we encode the data.
I think we'd be handicapped by traditional methods because of the sequential nature of those encoding methods. So much of reasoning relies on being able to see the whole of the input and output at once.
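One non-sequential encoding that fits this: keep the grid as a 2D array, and derive a boolean mask per color so whole-grid structure is available at once instead of being threaded through a token stream. This is just a sketch of the idea, and the `encode` helper is a name I made up.

```python
import numpy as np

def encode(grid):
    """Hold the whole grid at once: a 2D array plus one boolean
    footprint mask per color (illustrative, not a committed design)."""
    g = np.asarray(grid)
    return {c: (g == c) for c in np.unique(g)}

grid = [[0, 1, 1],
        [0, 0, 1],
        [2, 0, 0]]
masks = encode(grid)
# masks[1] is the full 2D footprint of color 1 -- no sequential scan needed.
```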
This is another cool one. Lots going on, but most of it is noise. Similar to the previous puzzle, in the sense that it solidifies the idea to keep a concrete representation of the input and output, where operations are applied as opposed to outputs being predicted.
I like this one. It looks like so much is going on, yet actually very little is going on. We have a very complex, noisy background, but that's not part of the challenge at all. You just need to replicate it to the output, and worry about the yellow and added red positions.
Some kind of crystal growth? This one's pretty easy to figure out as the colors (excl. background) are consistent across examples.
We have some dark blue nucleation points, from which a fixed structure "grows". The longer blue bars are also a part of this crystal.
This puzzle shows the need for multi-level grouping recognition. Usually, we'd look at the smaller structures and count those as groupings, but we see here that the collection of 2x2 blocks is also a structure in its own right. Then, we can learn the rule that within the structure we color red, and outside the structure we color blue.
The order in which we perform tasks in the algorithm certainly matters. We see here that we are expanding the bars towards the gray bar, but we need to start with the bar furthest from the gray to ensure that we get the overlay ordering right.
Hardest one I've seen so far! We have a mapping-and-tiling challenge here.
We use the yellow divider to recognize that within each training input, we have a sort of input-output expectation.
We next need to recognize how the structures of the input map to the new structure of the output. We use the colors to find that the order of colors in the output (top to bottom) comes from the colors of the input in a snake pattern (i.e., for four structures: top-left, bottom-left, top-right, bottom-right). This was a little more difficult because there are different ordering patterns from input to output.
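That reading order (column by column, top to bottom) can be sketched in a couple of lines. The structure names are placeholders; in practice each element would be a detected structure, not a string.

```python
def reading_order(structures):
    """Read a grid of structures column by column: top-left, bottom-left,
    top-right, bottom-right for a 2x2 layout (the ordering described above)."""
    return [s for col in zip(*structures) for s in col]

# For a 2x2 arrangement of colored structures:
layout = [["red",   "blue"],
          ["green", "yellow"]]
# reading_order(layout) -> ["red", "green", "blue", "yellow"]
```

Since different puzzles use different orderings, the solver would likely need to try several candidate orders and keep whichever one is consistent across all train examples.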
This one's purely a recoloring exercise. But the rules to recolor are not so simple and require some reasoning to figure out.
The rule is: for forms with 3 or more pieces, keep the color. Otherwise, transform to green.
The rule set is so simple, but it took a little time to figure out because it's not a direct or linear transformation. Some structures changed color, while others did not.
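The rule above is simple to state but needs connected-component detection to apply. Here's a sketch, assuming "pieces" means cells in a 4-connected shape and that green is color code 3 (both assumptions on my part).

```python
from collections import deque

GREEN = 3  # assumed color code for green

def recolor_small_forms(grid, min_size=3):
    """Shapes with >= min_size cells keep their color; smaller shapes
    turn green (a sketch of the rule described above)."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if grid[r][c] == 0 or seen[r][c]:
                continue
            # Flood-fill one connected component (4-connectivity).
            comp, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and grid[ny][nx] == grid[r][c]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) < min_size:
                for y, x in comp:
                    out[y][x] = GREEN
    return out
```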
This one tripped me up a bit. I thought the pattern was to repeat the outer colors diagonally and inward, skipping a space as you go, until you could go no further, ending with a gray piece.
While it was a pattern that was present, there was another pattern that was the right answer. Instead of tiling inwards, all you needed to do was draw one set of inward points around a center gray piece. The gray piece is at the point equidistant to all three outer squares.
Connect the dots with a gray line. Start at red (or green), go to yellow, then to the last point. Use Manhattan geometry. Make sure to keep the original points as they are.
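A sketch of that rule, assuming gray is color code 5 and that each segment is a single L-shaped Manhattan path (horizontal leg, then vertical). The caller supplies the point order (red/green, then yellow, then the last).

```python
GRAY = 5  # assumed color code for gray

def manhattan_connect(grid, points):
    """Draw a gray Manhattan (L-shaped) path through the points in order,
    keeping the original points untouched."""
    out = [row[:] for row in grid]
    keep = set(points)
    for (r1, c1), (r2, c2) in zip(points, points[1:]):
        for c in range(min(c1, c2), max(c1, c2) + 1):  # horizontal leg
            if (r1, c) not in keep:
                out[r1][c] = GRAY
        for r in range(min(r1, r2), max(r1, r2) + 1):  # vertical leg
            if (r, c2) not in keep:
                out[r][c2] = GRAY
    return out
```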
Hard to think of a physical process where we could simulate this or inform us of the rules.
This one was interesting. Made me think a lot about how we use color to reason about the significance of something. When you have a very colorful set of structures, then a gray, black, or white section, you can usually understand their meanings to be different.
Piece of cake with this one, but really cool. This problem (ARC-AGI as a whole) is just so cool because it shows the differences in human reasoning and machine "intelligence".
This pattern is easy for a human to pick up, but so much harder for an AI.
We've seen one like this before. A set theory problem. Again, a separate color divider helps us understand that it's splitting the input into two sections, which we can compare to each other since the dimensions are the same.
In this one, we overlay the upper section on the bottom one and look for the squares where neither section has a fill (NOR).
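The NOR overlay is a one-liner once the two halves are numpy arrays. The fill color (2, red) is an assumption for illustration.

```python
import numpy as np

FILL = 2  # assumed output color for NOR-true cells

def nor_overlay(top, bottom, fill=FILL):
    """A cell is filled in the output only where neither half has a fill
    (logical NOR), per the rule described above."""
    top, bottom = np.asarray(top), np.asarray(bottom)
    return np.where((top == 0) & (bottom == 0), fill, 0)
```

The same shape of function covers the other set-theory variants (AND, OR, XOR) by swapping the boolean expression, which suggests these puzzles share one parameterized operation in the search space.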
A classic tiling problem. We are given the input form, and we need to replicate it four times in the output: starting with the bottom-right corner, then reflecting over the Y axis to make the bottom left, then reflecting over the X axis to get the top section (which we could also break down as top left and top right).
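The steps above map directly onto numpy's flip/stack primitives. A minimal sketch:

```python
import numpy as np

def tile_reflect(form):
    """Build the 2x-sized output described above: the input form sits in
    the bottom-right, is mirrored left over the Y axis, and then the whole
    bottom row is mirrored up over the X axis."""
    form = np.asarray(form)
    bottom = np.hstack([np.fliplr(form), form])    # bottom-left = Y-axis mirror
    return np.vstack([np.flipud(bottom), bottom])  # top = X-axis mirror
```

Composing it from two reflections (rather than four independent quadrant placements) is the same decomposition noted above: top-left and top-right come for free once the bottom row exists.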