EVE Evolved: Why you should participate in EVE’s Project Discovery

Back at EVE Vegas 2015, CCP Games unveiled an ambitious project that aimed to involve EVE Online players in some really exciting scientific research that could make a big difference in the real world. CCP has been working with researchers from the Human Protein Atlas project on a way to gamify their research and integrate it directly into EVE in a way that respects the game lore. The Project Discovery minigame went live this week, and it’s been a big hit with the playerbase so far, with almost half a million submissions from over 23,000 players in the first day alone.

The minigame tasks players with identifying highlighted cell structures from fluorescent images in exchange for ISK and Analysis Kredits that can be used to buy some shiny new Sisters of EVE items. Project Discovery can be opened from the sidebar whether you’re docked or in space, making it a good way to kill some time while you’re waiting for something to happen. The task can be a bit tricky at first, but some players have already become expert classifiers with hundreds of submissions and accuracy ratings of over 90%.

In this edition of EVE Evolved, I delve into Project Discovery, link a few great community guides, and highlight some serious problems with it that have unfortunately appeared.

What is the actual research?

You may remember the monumental news back in 2003 that the Human Genome Project had successfully sequenced all 20,000 or so protein-coding genes in a reference human genome for the first time. This was a colossal scientific achievement, but it was just the first step in a much larger scientific process. Each of those sequenced genes codes for a particular protein or piece of functional RNA, and it’s important to understand what function each of those proteins serves in a human cell. The Human Protein Atlas aims to help answer that question by figuring out exactly which parts of the cell each protein is used in.

The atlas is an important resource for scientists around the world who are working on research projects involving human genetics. Researchers working on potential treatments for genetic disorders, for example, could get clues about the mechanism behind the disorder by looking at where the affected proteins are normally expressed in a healthy cell. Figuring out where a protein is expressed in a cell can also give clues as to its function and could lead to new treatments for a variety of medical conditions.

How to play the minigame

When you first launch Project Discovery, you’ll be presented with a tutorial that will walk you through some pre-classified images and explain how the interface works. You’re presented with an image of some cells with red, blue and green components, and your job is to identify which cell structures are coloured in green. The red component of the image is always cytoskeleton microtubules, the blue component is always the cell nucleus, and the green colour is the part of the cell you’re trying to identify. Each image represents a single protein that’s being investigated, and the green areas are everywhere that protein shows up.

The same protein can often be found in different parts of the cell, so the goal is to select every part of the cell that you think is being stained in green. Hovering over the various cell component options on the right-hand side will show you a few example images, but you shouldn’t just classify images based on a visual match with the examples. Each cell component has an accompanying description that provides a much better explanation of exactly what you’re looking for to identify that part. For example, nucleoplasm should only be selected if the green area overlaps completely with the blue colour, nucleoli always overlap with the holes in the blue colour, and plasma membrane is usually visible outside the red colour.

The colour options below the slide will let you toggle each individual colour on and off so that you can see whether the protein is overlapping a key feature like this. I find that it’s best to start each new image by selecting only the green colour to look for obvious features, and then toggle the red and blue on and off to look for overlaps. Remember that you can mouse over the slide to zoom in on a part of it, and click on the slide to lock that view in place so that you can have a really detailed look at the cell as you switch between the colours. If you find something that doesn’t match any of the options, you can also click the Abnormal Sample checkbox to flag the sample for review by researchers.
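
If you think of each colour channel as a mask of stained pixels, the overlap rules above boil down to a few simple fractions. Here’s a minimal Python sketch of that idea; it’s my own illustration rather than anything in the game client, and the interpretation notes are rules of thumb drawn from the descriptions above, not official HPA criteria.

```python
# Illustrative helper (not part of the game): report the channel-overlap
# fractions that the classification rules above tell you to look for.
import numpy as np

def overlap_report(red, green, blue):
    """red/green/blue are boolean NumPy arrays marking stained pixels."""
    g = green.sum()
    if g == 0:
        return {"note": "no green signal; consider flagging as abnormal"}
    return {
        # near 1.0 suggests nucleoplasm (green sits entirely inside the blue)
        "green_inside_blue": (green & blue).sum() / g,
        # green filling gaps in the blue is the nucleoli pattern; check by
        # eye that those gaps sit inside the nucleus rather than outside it
        "green_outside_blue": (green & ~blue).sum() / g,
        # a large fraction outside the red microtubules hints at membrane
        "green_outside_red": (green & ~red).sum() / g,
    }
```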

Don’t just click Cytoplasm

When you submit your classification, you get instant feedback showing you what percentage of players selected which options for that particular slide. This exposes some problems with the system, however, as the community often can’t make up its mind and the most popular choices are sometimes the wrong answer. The problem is that the rewards for Project Discovery are based on how quickly you can blitz through samples and whether or not your answer agrees with the community consensus, even if that consensus is wrong.

In a game like EVE, where players routinely manipulate the in-game markets and have a history of exploiting game mechanics for profit, this is potentially very dangerous. If players try to abuse the system by creating a false consensus in order to farm points, then the data retrieved from the project will not be useful. As there’s an incentive to respond as quickly as possible, most players will also select only one option even when multiple features exist, so those who correctly find multiple features are reportedly punished.
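
A toy simulation makes the danger concrete. Assume a slide whose correct answer is not cytoplasm, some fraction of players who blindly click cytoplasm for speed, and honest players who are right 60% of the time; all of these numbers are invented for illustration and have nothing to do with CCP’s actual scoring.

```python
# Back-of-the-envelope simulation of consensus corruption. All parameters
# are made up for illustration; this is not CCP's actual scoring system.
import random

def consensus_correct(n_players=100, blind_fraction=0.3, honest_accuracy=0.6):
    """Majority vote on one slide whose true answer is NOT cytoplasm."""
    votes = {"correct": 0, "cytoplasm": 0, "other": 0}
    for _ in range(n_players):
        if random.random() < blind_fraction:
            votes["cytoplasm"] += 1        # farming strategy: one fast click
        elif random.random() < honest_accuracy:
            votes["correct"] += 1          # honest player gets it right
        else:
            votes[random.choice(["cytoplasm", "other"])] += 1  # honest miss
    return max(votes, key=votes.get) == "correct"

trials = 10_000
corrupted = sum(not consensus_correct() for _ in range(trials)) / trials
print(f"slides where consensus lands on the wrong answer: {corrupted:.0%}")
```

With those numbers, the consensus comes out wrong more often than not, and every honest classifier who disagreed with it takes an accuracy hit.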

Preventing abuse

Abuse is supposed to be prevented by the accuracy rating system, which cuts off rewards if a player’s accuracy drops below 30%. I decided to test this by creating a new character and simply clicking cytoplasm every time, and unfortunately discovered that the system is abusable. After selecting only cytoplasm for over 250 submissions, I reached rank 24 with over 10,000 Analysis Kredits and maintained a steady accuracy rating of over 55%. Every now and then, you come across a pre-classified training image that’s probably designed to catch out bots, but it’s always a very obvious and clean image and the answer is usually nucleoli.

It’s clear that even the tiny financial rewards on offer in Project Discovery can corrupt the integrity of the data being collected. The project needs a lot more pre-classified training images of different types mixed in to catch people who are misidentifying things, along with harsher penalties for getting them wrong and the same explanations we get during the tutorial. Perhaps the immediate financial rewards should even be scrapped entirely, and players could instead be rewarded when they reach certain combinations of rank and accuracy rating (using only confirmed submissions that have reached a consensus). That way, players who click cytoplasm all day would never get any rewards, as they’d never break the 70-80% accuracy mark.
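
As a sketch, that deferred scheme could work something like the snippet below. Everything here, from the 70% payout bar to the reward size, is a hypothetical design choice on my part, not anything CCP has announced.

```python
# Hypothetical deferred-reward bookkeeping: nothing is paid out until a
# slide's consensus is confirmed, and payouts require a minimum accuracy.
from dataclasses import dataclass, field

@dataclass
class Player:
    pending: list = field(default_factory=list)  # (slide_id, answer_set)
    confirmed: int = 0                           # settled submissions
    correct: int = 0                             # settled and correct
    kredits: int = 0

    def accuracy(self):
        return self.correct / self.confirmed if self.confirmed else 0.0

def settle_slide(player, slide_id, consensus, reward=10, accuracy_bar=0.7):
    """Run once a slide's consensus answer set is confirmed by researchers."""
    remaining = []
    for sid, answer in player.pending:
        if sid != slide_id:
            remaining.append((sid, answer))
            continue
        player.confirmed += 1
        if answer == consensus:
            player.correct += 1
            if player.accuracy() >= accuracy_bar:   # no pay below the bar
                player.kredits += reward
    player.pending = remaining
```

Under a scheme like this, a player who clicks cytoplasm all day accumulates pending submissions, but once slides settle their confirmed accuracy stays far below the bar and the Kredits never arrive.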

Final thoughts

The Human Protein Atlas is an extremely worthwhile project that can make a huge difference to scientific research around the world, and gamifying it as Project Discovery was a fantastic idea. While it’s clear that the game mechanics can be abused right now, the people at HPA are really engaging with the EVE community on this and are already working to correct problems that people have brought up.

I highly recommend that every EVE player has a go at Project Discovery, but don’t do it for the tiny ISK rewards or the new Sisters of EVE swag. Do it to learn more about the different parts of the cell, to help with an incredibly important piece of scientific research, and for that slim chance of finding an abnormal sample that could lead to an interesting new discovery. Do it for science!

EVE Online expert Brendan ‘Nyphur’ Drain has been playing EVE for over a decade and writing the regular EVE Evolved column since 2008. The column covers everything from in-depth EVE guides and news breakdowns to game design discussions and opinion pieces. If there’s a topic you’d love to see covered, drop him a comment or send mail to brendan@massivelyop.com!

26 Comments on "EVE Evolved: Why you should participate in EVE’s Project Discovery"

DevinPSullivan:

peppzr Awesome! Thanks for the help and keep up the great work! :)

peppzr:

I thought this was only a minigame in EVE.

Have been doing it a lot; I went down to 1% at one point, and after analysing more carefully I am now at 93%.

Love to contribute to some real-life research :)

Zennie:

Nope. This is the best community to do citizen science with. People in EVE are used to finding glitches and abusing the game mechanics, but they also love science, and they are used to cooperating with devs in fixing those glitches. So you get the most critical audience, and they are actually trying to help.
(Inb4 EVE players are psychopaths for pvping in a pvp game)

kgptzac:

If this research speeds up our species’ evolution into infomorphs like Eve’s capsuleers then it’s double awesome!

MorpayneRADIO:

These people chose the wrong community for this experiment. My guess is that all they’re going to do is exploit as much ISK out of it as possible through nefarious means.

Nyphur:

DevinPSullivan Nyphur Incredibly interesting stuff and a great insight into what goes on at the HPA, thanks for sharing! I look forward to reading updates on Project Discovery as it progresses :D.

Loopstah:

Lord Zorvan Boardwalker Volunteering to do scientific work is both fulfilling and voluntary. It’s a win-win.

Cyraith:

DevinPSullivan Cyraith Nyphur Yay!

DevinPSullivan:

Cyraith Nyphur Fear not! There are changes coming… https://forums.eveonline.com/default.aspx?g=posts&m=6403702#post6403702

DevinPSullivan:

Nyphur DevinPSullivan Don’t be sorry! I love talking about this stuff! 

Yes, the fluorophores have different absorption and emission spectra. For excitation, you have a real problem…if you use a laser with higher energy than the excitation wavelength, you will still excite the fluorophore! This makes sense. If I can knock the fluorophore up an energy level with a small hit of energy, a larger one will also work, but it will get less efficient the further I go from the excitation peak.  

In terms of emission, fluorophores are generally not narrow enough to fully separate when using more than 2 colors. Here is a link where you can see the emission spectra for some common fluorophores. Notice how much they overlap! You can see the rectangles where they have suggested filtering the wavelengths, but you can easily see that this is imperfect and gets much worse the more colors you use (for example, look at DAPI vs FITC). [image: emission spectra of common fluorophores]

Here is another example that shows both the excitation and emission spectra to give you the full picture: [image: excitation and emission spectra]

You can (and we do) pick narrow wavelength filters and wait a long time, but this also means that you have to shoot the cells with A LOT more photons. This is really bad for the cells. In our case, since the cells are dead, it’s not quite as bad because the cell won’t up-and-die on us and start exploding (this actually happens if you shine too many high-powered photons on a cell). But each fluorophore has a “photon budget” or number of photons total it can emit before it gets “tired” and can’t relax to the proper state anymore to emit a photon. There are efforts to increase this photon budget for obvious reasons, but generally speaking making your emission window too narrow wastes photon budget and takes a lot longer (even if it’s only a second per image, add that up x4 colors, x4 images per experiment x96 experiments per plate and it starts to add up). And like I said, no matter how narrow, you still get some overlap if your emission spectra are stacked too close. 

Not much you can do I’m afraid!
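
To put rough numbers on that overlap, here’s a quick sketch that models two emission spectra as Gaussians. That’s a big simplification, since real spectra are broad and asymmetric, and the peak wavelengths are only approximate textbook values for DAPI and FITC.

```python
# Rough illustration of emission bleed-through using Gaussian spectra.
# Peaks are approximate; real fluorophore spectra are broader and skewed.
import numpy as np

wavelengths = np.linspace(380, 700, 2000)          # nm

def emission(peak_nm, width_nm=25.0):
    spectrum = np.exp(-0.5 * ((wavelengths - peak_nm) / width_nm) ** 2)
    return spectrum / np.trapz(spectrum, wavelengths)  # normalise to unit area

dapi = emission(461)   # DAPI emits around 461 nm (blue)
fitc = emission(519)   # FITC emits around 519 nm (green)

# Fraction of each dye's photons passing a 500-550 nm "green" filter window
window = (wavelengths >= 500) & (wavelengths <= 550)
for name, spec in (("DAPI", dapi), ("FITC", fitc)):
    frac = np.trapz(spec[window], wavelengths[window])
    print(f"{name} photons passing the green filter: {frac:.0%}")
```

Even in this generous toy model a few percent of the blue dye’s photons land in the green window, and narrowing the window to cut that leak also throws away green photons, eating into the photon budget described above.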

Lord Zorvan:

Boardwalker Lord Zorvan Playing EvE isn’t work. Doing scientific comparisons is.

Cyraith:

Nyphur Cyraith Nyphur That would be my solution too – you complete the puzzles, but the rewards aren’t disbursed until the results are actually confirmed, and you get feedback on why you were right or wrong.
I will say the tutorial left me a bit confused, especially with the technical terms. Not defining those meant I was faced with long biology words and no sense of how they relate to the images. It also took me some time to really understand that I’m looking at how the green overlaps the blue and red. The tutorial could use a line like, “Match the green in the sample with the green in the choices.”
I would like it if it defaulted to only showing the green to begin with. When classifying, I always switch to green first, and then tick the blue/red on and off.
The biggest problem I see is that the rewards aren’t tied to actually being right, and deferred compensation would be the best solution, because it’ll weed out those looking for immediate gratification and hopefully make people put some more care in; just winging it might mean they get away with being wrong for a little while before the real results catch up (and their accuracy, AK, and ISK are then credited).

Nyphur:

Cyraith I think the solution would be to not reward players for submissions until the submission reaches consensus. We should get reports on our submissions that reached consensus, what that consensus was, and what our reward was. We should also not be able to see the community’s percentages until an item reaches consensus, as getting feedback right after submission is likely to discourage people and may even get people to modify their behaviour (for example, by clicking cytoplasm all the time because everyone else is!).
Some people also don’t seem to understand that they’re supposed to be identifying where the green dye is. It might help if the game initially showed us only the green and then faded in the red and blue after a few seconds, as the first thing people do is usually to switch to green anyway. Perhaps more in-depth tutorials on each of the selection options would be useful to help us identify things like the Golgi Apparatus too, and the ability to see the examples larger and switch the colours around so we can see what they look like as green only would be great.

Nyphur:

DevinPSullivan Nyphur Ah-ha! I wondered why Brainbow used such a range of colours in its imaging but cell microscopy didn’t do the same; I didn’t really think about the fact that they have no overlap, so you really can unmix the signal after the fact.
I take it that all of the fluorophores have the same absorption spectra so you can’t activate them separately with different wavelengths of lasers? And is there a reason that you can’t use an extremely narrow wavelength filter to eliminate the bleedthrough problem and collect light over a longer duration to build up a signal?
Sorry to keep pelting you with questions, I just find this whole field of science really fascinating!

Cyraith:

Completely agree the system is abusable, and punishes even correct answers. There have been several images I’ve classified more accurately than the community consensus, which looks like it took the easiest answer based on the thumbnail examples. And then I wonder why I can’t break past 57% accuracy. Sometimes I get it wrong, but mostly, classifying accurately is punished compared with learning what the game wants and picking the most-likely-community answer.

Going into Discovery, I thought accuracy would adjust once the legitimate, real, answer was confirmed – not just whether you agree with the community or not. Oh how disappointed I am.

DevinPSullivan:

Nyphur DevinPSullivan That’s exactly what we do actually! Although what you see in the game is a colored image, each image is collected in grey scale by collecting photons in a specific wavelength. These are created as you suggest by the fluorophore emitting photons of a certain wavelength (and filtered by a special mirror that only lets one set of wavelengths through it — a dichroic mirror, hence my EVE name!) 

In our experiments we use 4 fluorophores and collect images from each set of wavelengths. In the game you see those for DNA (blue), tubules (red) and our protein of interest (green). We also label the endoplasmic reticulum (yellow), which you can see on the proteinatlas.org, but not in game. 

This is about as many as you can get because fluorophores are excited/emit photons in a bit of a spread, meaning that each photon doesn’t have exactly the same energy when being emitted or require the same amount to be given off. Some new fancy systems can do 5-6 colors, but you start running into bleed through problems with signals from other fluorophores you didn’t intend to excite. 

There are some really cool techniques that basically use multiple fluorophores to emit in more than one of these spectra and then use the relative color, once the channels are added, to computationally unmix which fluorophore they are looking at (like the brainbow project). This works really well when your signals don’t overlap (like the separate cells in the brainbow project) and can get you upwards of 20 colors!! The problem is that when structures overlap, it becomes really hard to do this unmixing accurately, because you can’t tell if a spot is red and green because it is a different fluorophore or because there is one red thing and one green thing there. There are people working on it, but I’m not sure how much success they have had with doing it subcellularly yet, because things are really, really crowded in your cells.
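
That unmixing step can be sketched as a small non-negative least-squares problem per pixel. The signature matrix below (how strongly each fluorophore shows up in each detection channel) is invented purely for illustration.

```python
# Toy linear spectral unmixing: recover fluorophore abundances per pixel
# from channel intensities, given known (here: invented) signatures.
import numpy as np
from scipy.optimize import nnls

# Rows = detection channels, columns = fluorophores.
signatures = np.array([
    [0.90, 0.10, 0.00],
    [0.25, 0.80, 0.05],
    [0.05, 0.30, 0.85],
    [0.00, 0.05, 0.40],
])

def unmix(pixel_channels):
    """Non-negative abundances best explaining one pixel's readings."""
    abundances, _residual = nnls(signatures, pixel_channels)
    return abundances

# A pixel containing fluorophores 0 and 2 in equal amounts, plus noise:
rng = np.random.default_rng(0)
measured = signatures @ np.array([1.0, 0.0, 1.0]) + rng.normal(0, 0.02, 4)
print(unmix(measured))   # roughly [1, 0, 1] while signatures stay distinct
```

The ambiguity described above appears when two different mixes produce nearly identical channel readings; once the columns of the signature matrix get too similar, the problem becomes ill-conditioned and small amounts of noise flip the answer.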

Nyphur:

DevinPSullivan Nyphur There’s nothing better than talking to someone who is this into their work, get as carried away as you like :D. It’s exciting to hear that you’re going down the neural networking route, so I guess a lot of the data collected in Project Discovery will help with the large volume of pre-classified images needed. I figured the microscopy might be the bottleneck and re-imaging could be infeasible, though it’s really interesting to hear about the possibility of bleaching and restaining the same cells.
Another crazy question while I have you: As I understand it, immunofluorescent antibodies can be produced in a range of different optical wavelengths. Could you select a handful of reference proteins and stain each with a different colour, then take the image using different filters over the CCD and save the data set as a series of grayscale images to be re-composited on the computer? Then the vision system could test similarity with at least a handful of references without the need to re-stain and find the same position, and it could test for similarities with multiple references at the same time to more accurately classify images that match multiple categories.

DevinPSullivan:

Nyphur DevinPSullivan To your first question, yes! As a computational biologist specializing in image analysis, this is basically my job. Together with my master’s student, we are currently working on implementing an artificial neural network approach which we hope can overcome some of the shortcomings of previous approaches. Computers are very good at image analysis, and have been used for similar tasks before, but the biggest challenge as you mentioned is that proteins may be present in any number of categories and at varying intensities. This makes the learning much less straightforward. As you point out, some of these categories could be relatively easy computationally to identify, though the goal is for a general solution.

Although it is possible to produce another set of images with different reference channels, imaging 20k proteins in at least 3 cell lines is a big task that until recently was performed entirely manually. Hence the data in the Atlas represents ~10 years’ worth of microscopy.

Even with laboratory automation which we recently implemented, you cannot escape the time-limitation of growing cells, seeding them and staining them. This takes approximately 3 days per experiment (96 wells, or 384 if you can get the cells to grow in that format) and that’s barring any errors or infections (if yeast, mold, or bacteria gets in your cells you have to throw them out and start again). As you can see, imaging every protein with every reference channel with this technology quickly becomes impractical from a time and technical standpoint, and that’s before cost is considered! 

One really neat new tech that is being developed is to remove (“wash out”) stains inside the same cells and re-stain them. The microscope then has to find the *exact* same position so that you can add the images together to get a holistic picture of the cell. This type of technique can be done using “photo bleaching” where you basically just burn out the old stain, or with “chemical bleaching” where you use a reagent to de-activate your fluorescent marker. With both of these methods you run some risk of damaging the other molecules in the cell, but from what I’ve seen these methods have a lot of potential.

Ultimately what I suspect we will end up with is an automated system that predicts the location(s) with a confidence, and then if any of them have a low confidence they will be inspected by humans to confirm/update the computer’s selection. This feedback will go back into the model to improve it too! This system, while powerful, is something you can’t build without lots of good training data to start with, since the computer must learn the rules that you have learned, like “if it’s overlapping the red a lot, it’s MT”.

Hope that answers some of the questions, obviously I can get carried away on this topic! :)
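
That predict-with-confidence, escalate-to-humans loop might look roughly like the sketch below; every name, threshold, and the multi-label scoring scheme are hypothetical.

```python
# Hypothetical confidence-gated pipeline: accept the model's multi-label
# call only when every score is decisive, otherwise queue for humans.
from typing import Callable, List, Sequence, Tuple

STRUCTURES = ["nucleoplasm", "nucleoli", "cytoplasm", "plasma membrane"]

def route_slide(predict: Callable[[bytes], Sequence[float]],
                image: bytes,
                threshold: float = 0.9) -> Tuple[str, List]:
    """predict() returns one independent score in [0, 1] per structure."""
    scores = list(predict(image))
    # Confident only if every score is clearly 'yes' (>= t) or 'no' (<= 1-t)
    decisive = all(s >= threshold or s <= 1 - threshold for s in scores)
    labels = [name for name, s in zip(STRUCTURES, scores) if s >= threshold]
    if decisive:
        return ("auto", labels)           # trust the model's classification
    return ("human_review", scores)       # send to players / researchers

# Confirmed human answers for the escalated slides would then be added to
# the training set, closing the feedback loop described in the comment above.
```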

Nyphur:

DevinPSullivan Awesome, and it’s fantastic to see this level of engagement on it!
The computer scientist in me keeps trying to think of ways that parts of the classification process could be automated using a computer vision system. For example, the proteins in the cytoplasmic skeleton (microtubules) are always a good visual match for the red dye and produce the distinctive yellow colour, so can’t that classification be done using software or is the problem that a lot of proteins have multiple classifications the computer would miss?
Would it be feasible to produce a line of slides with red staining on a different known reference protein in another part of the cell (such as the nucleolus)? Then you could use software to generate a percentage similarity/overlap between the reference protein and the selected one in order to pre-classify proteins into broad categories. Even if humans then have to weed through the images and throw out those that don’t actually match the computer classification or that actually have multiple classifications, that could be a simpler task with a much higher degree of success by untrained individuals.

Polyanna:

Every once in a while, despite themselves, CCP does something truly great.

DevinPSullivan:

Great article! I really liked the experiment on exploitability. For what it’s worth, we are changing that, and though I can’t say exactly what it will look like, I can say it will have many of the aspects you suggested, including an expanded training set and scaled checks that diminish with increased accuracy/rank.

o7
HPA_Dichroic

Wandris:

It’s a brilliant project, sort of along the lines of an organic supercomputer made up of human brains. Who needs an AI now?

Boardwalker:

Lord Zorvan I always enjoy EVE when I play. It never feels like “work”.

Lord Zorvan:

Make it where I get a free EvE sub for 10 hours of accurate clicky per month and I’ll do it. Otherwise, I don’t pay to work.

PurpleCopper:

Meh, I did my part using Folding@home with my PS3.

Nordavind:

I do not play EVE, and probably never will, but I welcome any endeavour like this. It’s not like we’re doing a ton of mundane and repetitive tasks in our games already. If they can help with something important, all the better.
For now, the only place I contribute to something like this is Google Translate. When I have a few minutes to spare, I go to https://translate.google.com/community and do ten translations or verifications of other people’s translations. Mostly the latter. Takes me 2-3 minutes. (And they have badges!!)
