An Antithetical Post On How Narrowing Is The Key to Curated Data

This whole business of curation has my head in a state: I am seeing data, meta-data, and users as distinct entities in three-dimensional space. I’d love to provide an image of how they relate, but I can’t. Whenever I try to place them in a 2-D or even 3-D layout, the objects warp and tunnel outside the third dimension to keep their proper relations.

Still here? Good. This post may be a bit vague. I’m going to try to keep it simple and understandable, for you as well as myself, since I’m already a bit confused after several hours of trying to map this. If you would like to discuss it in a more in-depth, though possibly less coherent, form, feel free.

To begin, we have three entities: data, meta-data, and users. The relationship between any two entities has a range, which runs from near to distant, and occasionally the relationship doesn’t exist at all. To describe the range using friends as an example: best friends with very similar taste are near (1); friends with much different taste are (2); acquaintances with similar taste are (3); acquaintances with different taste are (4); and people you’ve never met are (0). We’ll approach range this way, as the relational distance between entities.
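
As a rough sketch, the friends example can be written as a small lookup. The function name and the "friend" / "acquaintance" labels are my own placeholders, and I am assuming "best friends" and "friends" collapse into one closeness level, since the example only gives one taste value for each.

```python
from typing import Optional

# Lookup for the friends example: (closeness, similar taste?) -> range.
# The labels are hypothetical; the numbers come from the example above.
RANGE_TABLE = {
    ("friend", True): 1,         # near: close and similar taste
    ("friend", False): 2,        # close, but much different taste
    ("acquaintance", True): 3,   # distant, similar taste
    ("acquaintance", False): 4,  # distant, different taste
}

def relational_range(closeness: Optional[str], similar_taste: bool) -> int:
    """Return the range between two entities; 0 means no relationship."""
    if closeness is None:
        return 0  # people you have never met
    return RANGE_TABLE[(closeness, similar_taste)]

print(relational_range("friend", True))         # 1
print(relational_range("acquaintance", False))  # 4
print(relational_range(None, False))            # 0
```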

Data is, in my view, the front-facing objects, whether text, images, video, or even tactile objects. On its own, data has a weak presence in terms of the value it represents; when coupled with meta-data, it becomes stronger.

Meta-data is data about data. It is the entity that is manipulated and understood to provide us with relationship information on any level. There are many forms of meta-data (temporal, location, authorship, topics, etc.) that provide us with fantastic ways of connecting data, but oftentimes they pull in disparate entities that aren’t necessary.

The user, in my case, is a human who interprets the regular data and may create meta-data tags. The user can also be a machine, in which case it is likely to work with meta-data, either directly or by composing meta-data from data sources.

Now that the entities are somewhat defined, I can get into how they are connected to create relevant connections, both in basic terms and in user-specific terms.

Oftentimes the simplest way to construct a relevancy map between data objects is to use meta-data about the objects. Social-bookmarking tools work this way through topical tagging; the distance between objects is in the range of 4. To make the system a bit more complex, you take your tagged set and add user selection, using how much a user likes various items to adjust which topics they are likely to see. This is in the range of 3, because it is still picking out items by topic, which is a very wide net. Or you can show what the user’s friends have read recently. This is still in the range of 3: adding what other people read can narrow the area of focus, but it can still land in areas the user doesn’t care much for. If you add what the user’s friends like, rather than just what they read, you get closer to the range of 2.
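
Here is a minimal sketch of the first and last of those steps: filtering by tag (range 4), then ordering the result by what friends like (closer to range 2). The item ids, tags, friend names, and function names are all made up for illustration, and counting likes is just one simple way to do the ordering.

```python
from collections import Counter

# Hypothetical sample data: each item carries a set of topical tags.
ITEMS = {
    "a": {"python", "data"},
    "b": {"data", "curation"},
    "c": {"cooking"},
}

def by_tag(items, tag):
    """Range 4: every item that carries the tag, sorted by id."""
    return sorted(item for item, tags in items.items() if tag in tags)

def by_friend_likes(candidates, friend_likes):
    """Closer to range 2: order candidates by how many friends liked each."""
    counts = Counter(item for likes in friend_likes.values() for item in likes)
    # Most-liked first; ties fall back to the item id.
    return sorted(candidates, key=lambda item: (-counts[item], item))

candidates = by_tag(ITEMS, "data")
friend_likes = {"ann": {"b"}, "bob": {"b", "c"}}
ranked = by_friend_likes(candidates, friend_likes)

print(candidates)  # ['a', 'b']
print(ranked)      # ['b', 'a']
```

The tag filter alone returns everything on the topic; the friends' likes only reorder that set, which is why it narrows the range without leaving the topic.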

In order to get to the optimal range of 1, you have to add two more things to your system: direct relations between data-objects and concentrated interaction between users. Both can be defined explicitly by users, and both can be shown as a simple social graph, with one object or user in the center and the closest elements nearby. Direct relations, which are somewhat like Techmeme, can be created on a broad scale by a user-based system of bundling links to content based on relationship. Concentrated interaction is a bit more complex, because it requires an analysis of interaction, but it presents an interesting system that helps reach the range of 1.
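
One possible reading of "concentrated interaction", sketched under my own assumptions: treat every interaction as an undirected pair of users, count the pairs that involve the user in the center, and keep the most frequent partners. The function name, the pair format, and the user names are hypothetical, and a real analysis would likely weigh the kind of interaction too.

```python
from collections import Counter

def closest_users(interactions, center, top_n=3):
    """Return the users who interact most often with `center`.

    `interactions` is an iterable of (user_a, user_b) pairs, treated
    as undirected. Pairs that do not involve `center` are ignored.
    """
    counts = Counter()
    for a, b in interactions:
        if a == center and b != center:
            counts[b] += 1
        elif b == center and a != center:
            counts[a] += 1
    # Most interactions first; ties fall back to the user name.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [user for user, _ in ordered[:top_n]]

interactions = [("me", "ann"), ("ann", "me"), ("me", "bob"), ("bob", "cat")]
print(closest_users(interactions, "me"))  # ['ann', 'bob']
```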

Note: If you treat users like data-objects, which they are in a database, you can apply meta-data to them, making the concentrated interaction more specific to the topics each user is most familiar with.
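
A small sketch of that note, assuming each user record carries a set of topic tags just like a data-object would. The function name, the interaction counts, and the topic sets are invented for the example.

```python
def familiar_with(interaction_counts, user_topics, topic):
    """Keep only the interaction partners whose meta-data lists the topic."""
    return {
        user: count
        for user, count in interaction_counts.items()
        if topic in user_topics.get(user, set())
    }

# Hypothetical data: how often I interact with each user, and their topics.
counts = {"ann": 5, "bob": 2}
topics = {"ann": {"cooking"}, "bob": {"data", "python"}}

print(familiar_with(counts, topics, "data"))  # {'bob': 2}
```

Ann interacts with me more often, but for the topic "data" only Bob remains, which is the extra specificity the note is after.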

So I’ve discussed 5 ways, at varying levels of implementation, to reduce the range of relevancy:

1. Use tagging to create a quick reduction in the range of relevant data.
2. Use user selection to narrow down what topics the user likes, or aggregate content that the user’s friends are looking at.
3. Narrow it down further by what those friends like.
4. Allow bundling of content that is directly related.
5. Analyze the concentrated interaction graph to narrow down trusted sources.

I’m sure I’ve lost someone in this antithetical pile. I had to get this out of my head because it was driving me crazy, and I’m going to call it the beginning of a new arcling, to be adjusted down the line. So if you are interested, I’m sure we can make it a bit clearer by having a discussion.

  • This looks like a great start to getting your arms around an abstract representation of relevancy.
    Two comments:
    When you described the features that drive relevancy and the range of data, I couldn't help but feel that each end user may have slightly different tastes and priorities. This can change with mood as well to further complicate things.

    Your solution has as much artistic creativity as it has algorithmic functionality. Since you are tackling a very human problem, it makes sense that creative methods will be as much about art as they are technology.

    I'm fascinated by the same stuff, and look forward to how you implement your ideas.

    One idea I had for content compression is an image matrix that represents the tags/entities people and semantic engines assign to content in a feed or stream. A simple JavaScript/jQuery version is on GitHub. We're looking forward to open sourcing what we develop at Victus Media (OpenGard.in) as soon as we have a functional setup. You may find some more good tools there as well (I hope).

    Be well Jimminy 🙂
