An Antithetical Post On How Narrowing Is The Key to Curated Data

So this whole thing about curation has my head in a state where I am seeing data, meta-data, and users as distinct entities in three-dimensional space. I’d love to provide an image of how they are related, but I can’t: when I try to place them in 2-D or even 3-D, the objects warp and tunnel outside the third dimension to maintain their proper relations.

Still here? Good. This post may be a bit vague. I’m going to try to keep it simple and understandable, for you as well as for myself, since I’m already a bit confused after several hours of trying to map this. If you would like to discuss it in more depth, though possibly in a less coherent form, feel free.

To begin, we have three entities: data, meta-data, and users. The relationships between these entities range from near to distant, and occasionally don’t exist at all. To describe the range using friends as an example: best friends with very similar taste are near (1); friends with much different taste are (2); acquaintances with similar taste are (3); acquaintances with different taste are (4); and people you’ve never met are (0). We’ll approach range using this method, based on the relational distance between entities.
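As a rough sketch of that scale, the friends example can be written as a small lookup. The names `Closeness` and `relational_range` are my own for illustration, and I am assuming that "best friends" and "friends" can be collapsed into one closeness level, which the example above doesn’t strictly say.

```python
from enum import Enum


class Closeness(Enum):
    """Hypothetical closeness levels, named only for this sketch."""
    STRANGER = 0
    ACQUAINTANCE = 1
    FRIEND = 2


def relational_range(closeness, similar_taste):
    """Map a relationship onto the 0-4 range from the friends example.

    1 is the nearest relation, 4 the most distant, 0 means no relation.
    """
    if closeness is Closeness.STRANGER:
        return 0  # people you've never met
    if closeness is Closeness.FRIEND:
        return 1 if similar_taste else 2
    # Acquaintances: similar taste is nearer than different taste.
    return 3 if similar_taste else 4
```

The same function could be pointed at data-to-data or user-to-data relations, as long as you can decide what "closeness" and "similar taste" mean for that pair.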

Data is, in my view, the front-facing objects, whether that be text, images, video, or even tactile objects. On its own, data has a weak presence as far as the value it represents; when coupled with meta-data, it becomes stronger.

Meta-data is data about data. It is the entity that is manipulated and understood to provide us with relationship information on any level. There are many forms of meta-data (temporal, location, authorship, topics, etc.) that provide us with fantastic ways of connecting data, but oftentimes it pulls in disparate entities that aren’t necessary.

The user, in my case, is a human who interprets the regular data and may create meta-data tags. The user can also be a machine, in which case it is likely to work with meta-data, either directly or by composing meta-data from data sources.

Now that the entities are somewhat defined, I can get into how they are connected to create relevant connections, both in basic terms and in user-specific terms.

Oftentimes, the simplest way to construct a relevancy map between data objects is to use meta-data about the objects. Social-bookmarking tools work this way through topical tagging; the distance between objects is in the range of 4. To make the system a bit more complex, you take your tagged set and add user selection, using how much a user likes various items to adjust which topics they are likely to see. This is in the range of 3, because it is still picking out items by topic, which is a very wide net. Or you can provide what your user’s friends have read recently. This is still in the range of 3: adding what other people read can narrow the area of focus, but it’s still possible to land in areas that the user doesn’t care much for. If you add in what the user’s friends like, rather than just what they read, you get closer to the range of 2.

In order to get to the optimal range of 1, you have to add two more things to your system: direct relations between data objects and concentrated interaction between users. Both can be defined explicitly by users, and can be shown as a simple social graph, with one object or user in the center and the closest elements nearby. Direct relations, which are somewhat like Techmeme, can be created on a broad scale by a user-based system of bundling links to content based on relationship. Concentrated interaction is a bit more complex, because it requires an analysis of interaction, but it presents an interesting system that helps reach the range of 1.

Note: If you treat users like data objects, which they are in a database, you can apply meta-data to them to make the concentrated interaction more specific to the topics the user is most familiar with.

So I’ve discussed 5 ways, in varying levels of implementation, to reduce the range of relevancy:

Use tagging to create a quick reduction in the range of relevant data.
Use user selection to narrow down what topics the user likes, or aggregate content that the user’s friends are looking at.
Further narrow it down by what these friends like.
Allow bundling of content that is directly related.
Analyze the concentrated interaction graph to narrow down trusted sources.
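
To make the narrowing a little more concrete, here is a minimal sketch of how those five signals could be stacked to estimate a range for one item relative to a seed item. Everything in it is an assumption made for illustration: the `Item` fields, the `bundle` id, the idea of a `trusted` set standing in for the result of a concentrated interaction analysis, and the exact order of the checks. It is one possible reading of the list, not a finished design.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical data object with the meta-data this sketch needs."""
    title: str
    tags: set
    read_by: set = field(default_factory=set)
    liked_by: set = field(default_factory=set)
    bundle: str = ""  # assumed id shared by directly related content


def estimated_range(seed, item, user_topics, friends, trusted):
    """Estimate the range of `item` relative to `seed`.

    1 is nearest, 4 is widest, 0 means no relation was found.
    """
    # Tagging: without a shared tag there is no relation at all.
    if not (seed.tags & item.tags):
        return 0
    rng = 4
    # User selection, or what the user's friends have read.
    if (item.tags & user_topics) or (item.read_by & friends):
        rng = 3
    # What the user's friends actually like.
    if item.liked_by & friends:
        rng = 2
    # Bundled with the seed and liked by a trusted source.
    if seed.bundle and seed.bundle == item.bundle and (item.liked_by & trusted):
        rng = 1
    return rng
```

Sorting candidate items by this value, nearest first and dropping the zeros, would give a crude relevancy list for the seed item. The hard part, working out who belongs in `trusted`, is left out entirely.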

I’m sure I’ve lost someone in this antithetical pile. I had to get this out of my head, as it was driving me crazy, and I’m going to call it the beginning of a new arcling, to be adjusted down the line. So if you are interested, I’m sure we can make it a bit clearer by having a discussion.

The Future Of Privacy Is Full Publicy

Zuckerberg was right: “privacy was no longer a ‘social norm’.” Being public is the new social norm, though most people, even myself, will still tend to reject reality. I’ve finally gotten over about 90% of privacy issues. I might still get upset at them, but if something is exposed, I’m preparing for it now. Anyone under the age of 21 within the US who has ever used the internet has already lost their identity, so why should they worry about what any company is exposing about them? It’s time to get over these feelings and accept the change that is coming; a ton of privacy isn’t worth an ounce of knowledgeable protection.

Just the other day, Facebook proposed an update to their privacy policy to allow third parties to have access to your data at some point in the future, and with this comes yet another wave of criticism. Some people are jumping all over Facebook because they feel people will be paranoid that their data is vulnerable, and that their data shouldn’t be given out willy-nilly to just any third-party site that Facebook comes to an agreement with. You would think people would be used to this type of position coming from Facebook by now. This is their fourth or fifth slip-up, but people still complain for a few months and then calm down, until it happens again.

Our most personal data in the US, social security numbers, is insecure, especially if you were born after 1988. The numbers can be determined from 2 data points, date and location of birth, and a little brute forcing. So for the younger generation, nothing is private, not even our government-provided personal identification. If we aren’t protected in that regard, should we really be worried about those images from last weekend, who our friends are, or what our opinions are? I think Eric Schmidt said it best, in an interview where he discussed privacy: “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place.”

I know I jumped on Facebook, but they aren’t the only site that holds huge inventories of data on its users in hopes of adding relevancy; Google, Yahoo, Microsoft, et al. do too. Facebook is the simplest site to jump on because of its repeated transgressions in the area. Google has faced it as well, though, when it didn’t take enough discretion in opening up its Gmail users’ privacy through Buzz. As the web keeps advancing, privacy options are going to be set to off by default, and it will be up to the users to change the settings to keep themselves private. This has been called ‘publicy’.

Are you prepared for the next generation, the age of publicy? Are you ready to get dirty mucking around with settings to protect what little privacy you will have in the future? Will you let everything go, and change how you interact on the web? These are questions that we will all face, but I think I’m prepared to be completely open in my environment when it comes to social matters. They aren’t anything compared to my financial information or my social security number, which can apparently be brute forced by a botnet of 10,000 machines in ~1.27 seconds.

Update: Tyler Romeo’s latest post, Why I Dislike Facebook & Foursquare, makes a great point in contrast to the opinions I gave here. I agree with quite a bit of what he has to say as far as respecting your users and offering secure protocols to help protect them. Take your time and go check that post out.

Social Geo-Location Is A Weak Medium

Earlier, I was watching an Iron Maiden concert and realized that any decent medium can be used to express a story or culture. Social geo-location might be able to pass on a story, but the majority of the usage I’ve seen thus far doesn’t. This is just one of a few issues that make social geo-location weak; there are also the issues of user base, barrier to entry, and application of the data.

I feel that the location services aren’t suited to expressing the story. They don’t describe the why and the what of what is happening the majority of the time, and when they do, the data is extremely condensed to fit within the minuscule boxes of Twitter or SMS. Twitter is hard enough to express a story through, though you can still manage to get a story or a cultural message across in one tweet. Sharing a cultural message through one of these locations is likely even harder, with the exception of religious establishments.

How social can you really be with these applications? They all have tiny user bases, even after quite a bit of promotion on large blogs and a period of time. Foursquare, which is one of the most publicly discussed ones, only has half a million users, even after breaking out at SxSW last year. Few of the other services come close to Foursquare in size. The problem with low user adoption is that without your friends, how relevant can the product be? I’ll discuss that a little later.

The barrier to entry for nearly all of these services is that they are limited to internet-enabled phones, or smartphones. In fact, only one service of the several that I’ve looked at had an entry level that wasn’t quite so restrictive of its base, and it’s none other than Foursquare, with SMS check-ins, which still appear to be hit or miss. If you’re reducing your initial growth capabilities immediately, in a social market, you’re damaging your product.

The services use the location data in their own ways, but I don’t know if they are applying it where it would actually be of value: as an addition of context. If you can take the data from these products and connect it to events and people as they occur, you simplify the enrichment of the story. It’s still pretty easy to just say where the events took place with the addition of maybe 2 dozen keystrokes, as I write this at my house.

Another issue is that the product might not be relevant to users, especially when people begin using it to check in as they leave. If I were to use these services, it would be to let my friends know where I am, so now you have users undermining the principles of your product. Way to go. Your app actually ends up being even more irrelevant than it already is. The likelihood that your friends are even on the service is an anomaly in the first place, unless you live in a metropolitan area (e.g. New York, San Francisco, LA, Portland, Miami, etc.).

I give all the people who work on these applications props, though, because they discovered a great system. They created a user-promotion based advertising system, which they encourage by making deals with various venues to reward the heavy users, and by giving out little trophies for reaching little milestones to the rest of the users. They have also brought the idea of geo-location to the fore, which sometime in the future will be used to add context to real stories or cultural messages. So I would like to thank all the people who work on these apps for their work, but you guys apparently don’t understand geo-location: it is better suited to adding context to other mediums than to being an independent social medium.

Splitting the Web Markets

I’ve been looking into the web, trying to figure out what it’s going to look like in a few years. I’m still looking at various scopes, but I decided to analyze some of the more generalized markets that we have right now. You’re not going to find anything new here, just 5 areas of the web we will see changes in, and the coming monetization of the web.

Infrastructure = Hosting & ISPs

Data Resources = Data

Data Access & Storage Protocols (DASPs) = APIs

Services = Applications that modify the Data through use of APIs to provide a value

Directories = Provide the ability to find what you’re looking for quite rapidly; can be pseudo-static or dynamic.

Each of these different markets can and most likely will be monetized within the coming years, with the money most likely coming from the users themselves. Hosting & ISPs have already done it. Directories that aren’t fully dynamic can do it with advertising, and even some of the dynamic real-time directories will be able to use the advertising model. The Data & DASPs will be subsidized, for the most part, by the initial service’s charges, or possibly the service will be subsidized by external developers paying for access to the data, or just for the data itself.

The benefits we will see are that our data is more stable, at least in the sense that the company isn’t going to go belly up, that services should be better, and that there will be more positions, hopefully. We all walked around expecting everything to be free, when we should have been asking how we can help make more services. Maybe the free world was just the accelerant for innovation: a way to get the initial business models developed, promote an open generation, and allow everyone a shot at getting their ideas out there. It’s easier to pick up users for a simple service when you’re not charging them, after all. The problem that we had with free is that we all became so jaded by it.

Focus on one of these markets and how you can change it. Each one easily branches into another; you can traverse up or down that list from where you started. Look at Google, which exists in each of these markets. They started with a DASP that collected vast amounts of Data, then initially used this data to create a Directory Service, along with quite a few other services, one of which is AppEngine, which exists to share their infrastructure.

As the web evolves we’ll see these markets split and converge on each other time and time again, and we may even see a new general market pop up. As an example of a market splitting, look at the services: there are so many sub-markets within it that it would be hard to categorize them. For an example of convergence, you just have to look at the various projects being developed to better connect the web. One of the most recent ones to pop onto my radar is Salmon, which is working to pull comments back to the original source and re-disperse them with the source feeds. Time to watch the ebb and flow, and maybe enter one or more of these markets.

Thoughts are Evolutionary: The Idea for Arclings

Do you really want to keep pushing ideas out, but have problems fleshing the concept out fully? Or maybe you just want to express the basis of an idea really quick, get feedback, and iterate. The problem with current systems is that it’s hard to keep track of the evolution if you post a lot of other stuff around it.

Micro-blogging lets you throw the idea out there, but doesn’t allow much room for the idea to evolve, or for tracking that evolution.

Blogging in the conventional sense is much too concrete (though I’m doing it right now). I find the preconception of blogging to be that you must push out a full thought. Why?

I propose a release-quick, release-often blogging structure, building arcs as your story develops and making branching trees using link structures. Let the ideas build over weeks or months, rather than waiting for one single burst of insight and fleshing it out on the spot.

I propose using story arcs, along with links to the latest preceding events in the evolution, and trackbacks to the succeeding story events. Though this is possible in the current evolution of blogging systems, it’s complicated. I want an Arcling platform that makes the connection process easy, if not intelligent, in managing the tracing of the structure.
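
To show what I mean by the link structure, here is a minimal sketch of an arc as a branching tree: each post keeps a link to the post it evolved from and a list of trackbacks to the posts that followed it. The names `Arcling`, `publish`, and `trace_back` are made up for this sketch; no such platform exists yet, and a real one would need storage, URLs, and a lot more.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps comparisons identity-based, since posts reference each other.
@dataclass(eq=False)
class Arcling:
    """One post in an arc (hypothetical structure for illustration)."""
    title: str
    previous: Optional["Arcling"] = None  # link to the preceding event
    trackbacks: List["Arcling"] = field(default_factory=list)  # succeeding events


def publish(title, previous=None):
    """Create a post and register it as a trackback on its predecessor."""
    post = Arcling(title, previous)
    if previous is not None:
        previous.trackbacks.append(post)
    return post


def trace_back(post):
    """Return the titles from the start of the arc up to this post."""
    chain = []
    while post is not None:
        chain.append(post.title)
        post = post.previous
    return list(reversed(chain))
```

Publishing two posts that both point at the same predecessor is what creates a branch, and `trace_back` follows a single branch back to where the idea started.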