article

Palantir

[WHAT]

] by __ @ __ - private company,

[WHY]

] data analysis
] data consolidation
]

[WHERE]

] READ THE FULL ARTICLE

]

] headquarters: Palo Alto, CA

[WHEN]

] 2017-01-28

[EXAMPLE]

] founder(s) = Peter Thiel,

[HOW-TO]

]

[REFERENCE]

] SRC = HN, comments

[RELATED]

[2007] So what do you guys do? (?2007?)
- SRC = https://web.archive.org/web/20110929172325/http://blog.palantirtech.com/2007/12/04/what-do-we-do/ (SEE below)
[2007] Palantir: embodying a 50-year-old vision of the future?
- SRC = https://web.archive.org/web/20111002074059/http://blog.palantirtech.com/2007/03/16/human-computer-symbiosis/ (SEE below)
[2008] screenshots
- SRC = https://web.archive.org/web/20110929172030/http://blog.palantirtech.com/2008/07/04/palantir-screenshots-round-two/
[2009] how to rock an algorithms interview, (2009?) SRC = hn.algolia?='algorithms'
[2017] are there any open source alternatives to palantir, SRC=best-of-hn-2017-01-dd
] + art RECENTER on interviewing / hiring process ???

Palantir: so what is it you guys do?

December 4th, 2007 | Kevin

I often ask candidates if they’re familiar with what we do at Palantir. Most people think they are. “Oh, you’re that data viz. company,” or, worse, “You guys do data mining, right?” At least they’ve heard of us and at least they’re on the right track, but I cringe anyway. We aren’t just a “data visualization” company and we don’t do “data mining.” It’s almost impossible to convey the scope and complexity of what we do in a few short minutes—or to do so without taking the conversation to an eye-glazing level of abstraction.

The following is my attempt at describing what we do at a high level without oversimplifying. I hope that after reading this a candidate will ‘get’ what we’re about, or at least understand enough not to apply tiny labels to our expansive vision.

The problem: implementing analysis

At Palantir we specialize in analysis.

Yes, that’s painfully abstract, and I’ll get to it in a second.

In real-world terms, we are building a software platform that enables people to take whatever data is relevant to them and understand it more easily and thoroughly than ever before, using concepts that they already understand. And we are applying this vision, at first, to solving problems in the finance sector and the government intelligence community.

The first important thing to note is that we don’t actually do the analysis ourselves. We don’t devise winning trading strategies and we don’t catch terrorists. We write software that enables other people to pull off these feats. These people, experts in their respective fields, are called analysts.

So what exactly do analysts do? What is analysis?

Analysis is everything necessary to extract insight from information.

Let’s break that down a bit.

Information is easy: It’s data. It lives in a relational database or as files indexed on a hard drive, and you can easily run queries against it. It comes in two forms, structured and unstructured. And there is a lot of it in the modern world – too much, actually, for current tools to make sense of.

Insight is trickier. Insight is something only a person can generate, and understanding this is critical for any organization that wants to do analysis right. Thus the challenge of data analysis is how to bring vast amounts of information into productive contact with human intelligence. In other words, the challenge is how to enable the analyst.

From the analyst’s perspective there are five essential features of an analysis platform:

1. First, and most important, the analyst should be in control. In other words, the primary way of interacting with an analysis tool should be human-driven queries. While automated approaches can complement a human-driven approach, there simply is no substitute for human intelligence. Unless you put a person behind the wheel, the system can never be flexible or creative enough to uncover truly original insight. Artificial Intelligence just isn’t there yet.
2. Ability to summarize large data sets. Some of this is what has traditionally been called data mining: the largely automated approach, using machine learning or other statistical techniques, of processing lots of data at once and extracting nuggets that capture something interesting about the data. Unlike Palantir, traditional approaches have focused almost exclusively on this aspect of analysis.
3. Ability to visualize large data sets. Here the analyst wants interesting and informative ways of viewing data graphically, to make it easier for him to digest. The analyst wants more than just a summary of the data; he wants a nuanced view of what’s going on inside these data sets: What’s the overall shape of the distribution? What are the outliers? What are important structures within the data?
4. Ability to iterate rapidly. This means enabling the analyst to ask a question, get the answer, and then quickly ask either a variant on the initial question or a follow-up question that depends on the answer to the initial question. This rapid, iterative process allows the analyst to quickly test out hypotheses and develop theories about what’s going on in the data, and by extension to discover what’s going on in the world.
5. Ability to collaborate with other analysts. Getting a handle on a terabyte of data, especially when it comprises multiple data types, is definitely more than a one-person job. Any organization that’s serious about understanding the world needs a team of analysts that can work together as more than the sum of its parts. This requires the ability for one analyst to effortlessly share the results of his analysis with his colleagues.

The Palantir approach

That’s what analysis looks like to the analyst, or rather what it should look like in an ideal world. (Current tools fall far short of this vision.) So what do we do at Palantir in order to make analysis this smooth and easy?

You could say that we help summarize large data sets, in the sense that we have to provide the analyst with a rich library of techniques and algorithms. You could also say that we do visualization, in the sense that we have to provide the analyst with a set of interesting and informative ways of visualizing their data. We do both of these things, and we have to be creative and solve hard problems in order to add value in these areas. But we do a lot more than that.

Probably the most central hard problem that we address in trying to enable the analyst is data modeling, the process of figuring out what data types are relevant to a domain, defining what they represent in the world, and deciding how to represent them in the system. At Palantir we make sure our data model (ontology) is both flexible and dynamic, and that it mirrors the concepts people naturally use when reasoning about the domain. This is no small challenge, but we’re already making it a reality. In finance our basic data types include financial instruments, dates, portfolios, indices, and strategies—the same things that financial researchers think about, talk about, and reason with. In the intelligence product our basic data types include people, places, and events (all with associated properties), which is exactly the way we all represent the world in our minds.
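As a rough illustration of the data-modeling idea above, here is a minimal sketch of a small, runtime-extensible model built around people, places, and events with associated properties. The class names (`Entity`, `Ontology`) and their methods are hypothetical, invented for this example; nothing here reflects how Palantir's actual data model is implemented.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Entity:
    """One object in the model, e.g. a person, place, or event (hypothetical class)."""
    kind: str
    name: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Ontology:
    """Holds the allowed object types and the objects themselves (hypothetical class)."""
    kinds: set[str] = field(default_factory=set)
    entities: list[Entity] = field(default_factory=list)

    def define_kind(self, kind: str) -> None:
        # "Dynamic" here means new types can be registered at runtime.
        self.kinds.add(kind)

    def add(self, kind: str, name: str, **properties: Any) -> Entity:
        # Reject objects whose type has not been defined in the model.
        if kind not in self.kinds:
            raise ValueError(f"unknown kind: {kind!r}")
        entity = Entity(kind, name, dict(properties))
        self.entities.append(entity)
        return entity

    def find(self, kind: str, **where: Any) -> list[Entity]:
        # Return entities of the given kind whose properties match every filter.
        return [
            e for e in self.entities
            if e.kind == kind
            and all(e.properties.get(k) == v for k, v in where.items())
        ]


model = Ontology()
for kind in ("person", "place", "event"):
    model.define_kind(kind)

model.add("person", "Alice", role="courier")
model.add("place", "Palo Alto", country="US")
model.add("event", "Meeting", place="Palo Alto", date="2007-12-04")

# Query in the same terms an analyst would use: events at a given place.
print([e.name for e in model.find("event", place="Palo Alto")])
```

The point of the sketch is only that queries are phrased in domain concepts (an event at a place) rather than in terms of tables or files.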

Data modeling, data summarization, and data visualization are the core disciplines for approaching large data sets. Human-driven queries, rapid iteration, and collaboration are multipliers, taking the power unlocked by the core disciplines to the next level. When these pieces are brought together in a coherent system, the result is an analysis platform both very generic and very powerful.

This is what we mean when we say that we’re changing the way people approach data. Welcome to the future of analysis.

Palantir: embodying a 50-year-old vision of the future?

March 16th, 2007 | Ari

Here at Palantir, Charles Cooper’s recent piece on CNET about J. C. R. Licklider has struck us as a very timely piece of journalism.

Licklider was a very influential man, with Cooper even crediting him with the existence of Computer Science as a modern-day field:

Until Licklider began his work at ARPA, there were no Ph.D. programs in computer science at American universities. That changed after ARPA began handing out grants to promising students, a practice that convinced MIT, Stanford, UC Berkeley and Carnegie Mellon to start their own graduate programs in computer science in 1965. Maybe that should go down as Licklider’s most lasting legacy.

In the piece, Cooper references this influential and well-known work by Licklider: Man-Computer Symbiosis, by J. C. R. Licklider, published in IRE Transactions on Human Factors in Electronics, volume HFE-1, pages 4-11, March 1960.

That’s right, it was written almost 50 years ago. That said, it’s incredibly relevant today, perhaps more than ever.

Here’s the abstract:

Man-computer symbiosis is an expected development in cooperative interaction between men and electronic computers. It will involve very close coupling between the human and the electronic members of the partnership. The main aims are 1) to let computers facilitate formulative thinking as they now facilitate the solution of formulated problems, and 2) to enable men and computers to cooperate in making decisions and controlling complex situations without inflexible dependence on predetermined programs. In the anticipated symbiotic partnership, men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations. Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking. Preliminary analyses indicate that the symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them. Prerequisites for the achievement of the effective, cooperative association include developments in computer time sharing, in memory components, in memory organization, in programming languages, and in input and output equipment.

Licklider’s account of his own working habits is still a pretty accurate description of how most analysts (in any industry or field) go about their business:

Despite the fact that there is a voluminous literature on thinking and problem solving, including intensive case-history studies of the process of invention, I could find nothing comparable to a time-and-motion-study analysis of the mental work of a person engaged in a scientific or technical enterprise. In the spring and summer of 1957, therefore, I tried to keep track of what one moderately technical person actually did during the hours he regarded as devoted to work. Although I was aware of the inadequacy of the sampling, I served as my own subject.
…
About 85 per cent of my “thinking” time was spent getting into a position to think, to make a decision, to learn something I needed to know. Much more time went into finding or obtaining information than into digesting it. Hours went into the plotting of graphs, and other hours into instructing an assistant how to plot. When the graphs were finished, the relations were obvious at once, but the plotting had to be done in order to make them so.
…
Throughout the period I examined, in short, my “thinking” time was devoted mainly to activities that were essentially clerical or mechanical: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or an insight. Moreover, my choices of what to attempt and what not to attempt were determined to an embarrassingly great extent by considerations of clerical feasibility, not intellectual capability.

This quote is an eerily accurate description of how trading strategies are formulated, back-tested, and implemented these days. As an analogy, it’s also an accurate reflection of the modern use of information processing in the intelligence space.

To wit:

It is to bring computing machines effectively into processes of thinking that must go on in “real time,” time that moves too fast to permit using computers in conventional ways. Imagine trying, for example, to direct a battle with the aid of a computer on such a schedule as this. You formulate your problem today. Tomorrow you spend with a programmer. Next week the computer devotes 5 minutes to assembling your program and 47 seconds to calculating the answer to your problem. You get a sheet of paper 20 feet long, full of numbers that, instead of providing a final solution, only suggest a tactic that should be explored by simulation. Obviously, the battle would be over before the second step in its planning was begun. To think in interaction with a computer in the same way that you think with a colleague whose competence supplements your own will require much tighter coupling between man and machine than is suggested by the example and than is possible today.

So how does this relate to what we do? In the finance world, much of what fund managers and analysts do in building strategies has to do with formulating trading models and then building spreadsheets that can back-test or simulate the performance of those models.

Our finance tool obviates the need for this “clerical, mechanical” work, allowing strategists to spend more time making sense of the interconnections in the market and formulating nuanced trading strategies and less time doing model-building in Excel. We take the state of the art in financial analysis a quantum leap forward: rather than just allowing analysts to quickly build models and back-test trading strategies, we’ve built a tool that allows for a smooth flow from hypothesis to theory, with the software doing all the heavy lifting, data wrangling, and eye-candy-class presentation. New variables or market conditions can be incorporated on the fly without the need for a pause from high-level thinking to gather data or marshal it into the right format. Knowledge can be divined by asking questions in terms of high-level concepts such as dynamic market conditions and meta-conditions like the volatility-of-volatility.
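To make the "volatility-of-volatility" meta-condition concrete, here is a minimal sketch under one common reading of the term: the rolling standard deviation of a rolling volatility series. The post does not define the term or say how the finance tool computes it, so the function names, the use of log returns, and the windowing scheme below are all assumptions made for illustration.

```python
import math
from statistics import stdev


def log_returns(prices):
    # Period-over-period log returns: ln(p[t] / p[t-1]).
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_stdev(values, window):
    # Sample standard deviation over each full trailing window.
    if window < 2:
        raise ValueError("window must be at least 2")
    return [
        stdev(values[i - window:i])
        for i in range(window, len(values) + 1)
    ]


def vol_of_vol(prices, window):
    # Step 1: returns. Step 2: rolling volatility of returns.
    # Step 3: rolling volatility of that volatility series.
    volatility = rolling_stdev(log_returns(prices), window)
    return rolling_stdev(volatility, window)


# Made-up prices, purely for illustration.
prices = [100.0, 101.5, 99.8, 102.3, 103.1, 101.0, 104.2, 105.0, 103.7, 106.4]
print(vol_of_vol(prices, window=3))
```

With `n` prices and a window of `w`, the result has `n - 2*w + 1` values, since each rolling pass consumes `w - 1` points and taking returns consumes one more.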

The question has traditionally been, “How do I effectively model this financial space?” With Palantir, we’re transforming that question into the core question asked in the finance industry, namely, “How can I better understand the interactions at work in today’s markets?” So the focus moves to the human-level questions while the software takes care of the data-level machinations.

In the intelligence space, the composite views of data that the government team creates save the analysts from having to painstakingly research and record correlations across multiple informational domains. Instead, the analyst can spend time divining the meaning behind the connections and correlations. Our take on perpetual analytics takes things a step further, alerting the analyst as relevant new information enters the system. And finally, we’re building workflows that allow analysts to quickly attach ‘handles’ to data, so that what has traditionally been unstructured data gets a seat at this table of computer-enhanced human analysis.
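The idea of alerting an analyst as relevant new information arrives can be sketched as a set of standing queries that are checked against each incoming record. This is only an illustration of the general pattern; `StandingQuery`, `AlertingStore`, and the record fields are hypothetical names, and the post gives no detail on how Palantir's perpetual analytics actually works.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class StandingQuery:
    """A question an analyst wants re-asked whenever new data arrives (hypothetical)."""
    analyst: str
    description: str
    matches: Callable[[dict[str, Any]], bool]


@dataclass
class AlertingStore:
    """Stores records and evaluates every standing query on ingest (hypothetical)."""
    records: list[dict[str, Any]] = field(default_factory=list)
    queries: list[StandingQuery] = field(default_factory=list)

    def watch(self, query: StandingQuery) -> None:
        self.queries.append(query)

    def ingest(self, record: dict[str, Any]) -> list[tuple[str, str]]:
        # Save the record, then report which analysts should be alerted and why.
        self.records.append(record)
        return [
            (q.analyst, q.description)
            for q in self.queries
            if q.matches(record)
        ]


store = AlertingStore()
store.watch(StandingQuery(
    analyst="analyst-1",
    description="new events in Lisbon",
    matches=lambda r: r.get("kind") == "event" and r.get("place") == "Lisbon",
))

for alert in store.ingest({"kind": "event", "place": "Lisbon", "name": "Meeting"}):
    print(alert)
```

The analyst states the question once, and the system keeps asking it on their behalf as data enters.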

We’re speeding up the process of analysis by creating an analyst-computer symbiosis. No longer will people need to spend time doing menial data processing; the computers will do it for them, while the humans provide the spark of insight, semantics, and cognition that computers lack.

It’s conceptual analysis at the speed of thought.

This is why I’m excited to come to work every day: we’re building the software that embodies a broad vision of the future. This vision of human-computer symbiosis dates from five decades ago but is also apparent in every interaction we see with computers on the big and small screens (no, not our monitors). From Star Trek to 24, people want the computers to do the repetitive and time-consuming simple work but let them have final say on any complex decisions. As one of our customers told us when shown our application: this is the future.
