Computational Biology postdoc working on Proteomics at Harvard Medical School. R user and fanboy. Debianite and Free Software dilettante. Gamer with a Job.
6 stories
·
2 followers

Is Social Media Losing Ground To Email Newsletters?

1 Comment
"My favorite new social network doesn't incessantly spam me with notifications," brags New York Times technology writer Mike Isaac. "When I post, I'm not bombarded with @mentions from bots and trolls. And after I use it, I don't worry about ads following me around the web. "That's because my new social network is an email newsletter." Every week or so, I blast it out to a few thousand people who have signed up to read my musings. Some of them email back, occasionally leading to a thoughtful conversation. It's still early in the experiment, but I think I love it. The newsletter is not a new phenomenon. But there is a growing interest among those who are disenchanted with social media in what writer Craig Mod has called "the world's oldest networked publishing platform." For us, the inbox is becoming a more attractive medium than the news feed... For me, the change has happened slowly, but the reasons for it were unmistakable. Every time I was on Twitter, I felt worse. I worried about being too connected to my phone, too wrapped up in the latest Twitter dunks... Now, when I feel the urge to tweet an idea that I think is worth expounding on, I save it for my newsletter... It's much more fun than mediating political fights between relatives on my Facebook page or decoding the latest Twitter dustup... "You don't have to fight an algorithm to reach your audience," Casey Newton, a journalist who writes The Interface, a daily newsletter for technology news site The Verge, told me. "With newsletters, we can rebuild all of the direct connections to people we lost when the social web came along." The article suggests a broader movement away from Facebook's worldview to more private ways of sharing, like Slack . "We felt this growing sense of despair in traditional social media," says the CEO of Substack, makers of a newsletter-writing software. "Twitter, Facebook, etc. -- they've all incentivized certain negative patterns."

Read more of this story at Slashdot.

Read the whole story
dnusinow
1860 days ago
reply
This is totally me lately
Boston, MA
Share this story
Delete

The Working Person's Guide to the Industry That Might Kill Your Company

1 Share

Just like $2.5 trillion worth of companies around the world, the company publishing the story you are now reading is owned by private equity firms. Most people have little idea what the private equity industry actually does. The truth is terrifying.

Read more...

Read the whole story
dnusinow
2216 days ago
reply
Boston, MA
Share this story
Delete

Continuum between anecdote and data

1 Comment and 2 Shares

The difference between anecdotal evidence and data is overstated. People often have in mind this dividing line where observations on one side are worthless and observations on the other side are trustworthy. But there’s no such dividing line. Observations are data, but some observations are more valuable than others, and there’s a continuum of value.

Rib eye steak

I believe rib eye steaks are better for you than rat poison. My basis for that belief is anecdotal evidence. People who have eaten rib eye steaks have fared better than people who have eaten rat poison. I don’t have exact numbers on that, but I’m pretty sure it’s true. I have more confidence in that than in any clinical trial conclusion.

Hearsay evidence about food isn’t very valuable, per observation, but since millions of people have eaten steak for thousands of years, the cumulative weight of evidence is pretty good that steak is harmless if not good for you. The number of people who have eaten rat poison is much smaller, but given the large effect size, there’s ample reason to suspect that eating rat poison is a bad idea.

Now suppose you want to get more specific and determine whether rib eye steaks are good for you in particular. (I wouldn’t suggest trying rat poison.) Suppose you’ve noticed that you feel better after eating a steak. Is that an anecdote or data? What if you look back through your diary and notice that every mention of eating steak lately has been followed by some remark about feeling better than usual? Is that data? What if you decide to flip a coin each day for the next month and eat steak if the coin comes up heads and tofu otherwise? Each of these steps is an improvement, but there’s no magical line you cross between anecdote and data.
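As an aside not in Cook's post, the coin-flip version of the steak experiment is easy to sketch in R; the daily wellbeing score and the assumed effect of steak below are made up purely for illustration.

    # Hypothetical n-of-1 experiment: flip a coin each day for a month,
    # eat steak on heads and tofu on tails, and record a (made-up) wellbeing score.
    set.seed(42)
    days  <- 30
    steak <- rbinom(days, size = 1, prob = 0.5) == 1             # TRUE = steak day
    score <- rnorm(days, mean = ifelse(steak, 7, 6), sd = 1.5)   # assumed effect of +1
    score <- pmin(pmax(round(score), 1), 10)                     # clamp to a 1-10 scale

    # With only ~15 days per arm, expect wide uncertainty in the comparison.
    tapply(score, steak, mean)
    t.test(score[steak], score[!steak])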

Suppose you’re destructively testing the strength of concrete samples. There are better and worse ways to conduct such experiments, but each sample gives you valuable data. If you test 10 samples and they all withstand two tons of force per square inch, you have good reason to believe the concrete the samples were taken from can withstand such force. But if you test a drug on 10 patients, you can’t have the same confidence that the drug is effective. Human subjects are more complicated than concrete samples. Concrete samples aren’t subject to placebo effects. Also, cause and effect are more clear for concrete. If you apply a load and the sample breaks, you can assume the load caused the failure. If you treat a human for a disease and they recover, you can’t be as sure that the treatment caused the recovery. That doesn’t mean medical observations aren’t data.

Carefully collected observations in one area may be less statistically valuable than anecdotal observations in another. Observations are never ideal. There’s always some degree of bias, effects that can’t be controlled, etc. There’s no quantum leap between useless anecdotes and perfectly informative data. Some data are easy to draw inference from, but data that’s harder to understand doesn’t fail to be data.

Read the whole story
dnusinow
2975 days ago
reply
John Cook gives a nice explanation about how anecdotes are still data, and the reality is far more complicated than a binary.
Boston, MA
Share this story
Delete

September 28, 2013

2 Comments and 17 Shares

Early update because I'll be at Festiblog all day!
Read the whole story
RangerRick
3860 days ago
reply
Cloacamazing!
Raleigh, NC
dnusinow
3858 days ago
reply
Boston, MA
Share this story
Delete
1 public comment
adamgurri
3863 days ago
reply
Animal pals
New York, NY

5 reasons to love logarithms

1 Share

I was discussing with a maths-minded friend the difference between "quantitative" and "non-quantitative" science, mainly how biology had to get its quantitative mojo back, and I said that a good proxy for whether someone is "quantitative" or not is whether they are at home with logarithms - do they use them, are they comfortable with logs between 0 and 1, can they read log plots?

Logarithms are extremely useful in many scenarios. To remind readers who are a bit rusty: a logarithm is the power to which you have to raise another number (called the "base" of the logarithm) to get a third number. You set the base as a constant, and for most scenarios the base actually doesn't matter (it will shift some offsets or some scales, but not change the shape of anything). So, if we set the base to 10 (notation is log10), then log10 of 100 is 2 (10 squared is 100), log10 of 1000 is 3 (1000 is 10 cubed, so 10 to the power of 3), etc. log10 of 1 is 0, and that's true for every base. A slightly looking-glass world appears between 0 and 1, where the logs of these numbers are negative: log10 of 1/100 is -2, as it is 1 over 10 squared. Logs are not defined for 0 or below (at least for real numbers; one would have to use complex numbers). Notice that the "space" between 0 and 1 maps to the space between 0 and negative infinity - one of the nice illustrations of how rational numbers have an infinite number of numbers between them.

When I was first taught about logarithms I found the base both a bit inelegant and also a bit annoying - what base should I "use"? But I've come to realise that the base just scales things. log2(10) is 3.3(ish), log2(100) is 6.6(ish); log10(10) is 1, log10(100) is 2. There are 3 commonly used bases. Base 10 might be verbalised as "orders of magnitude", e.g. "we are going to need two orders of magnitude more disk for that metagenomics experiment". Base 2 is verbalised as "bits" and is particularly useful for probabilities/information content, e.g. "the distribution of amino acids in this column has at least 8 bits of information". The final base is "e" - one of these magical mathematical constants which pops up all over the place (along with Pi, 1, 0 and others). e can be defined in a variety of ways (I always like the simple sum 1 + 1/1 + 1/(1*2) + 1/(1*2*3)... and so on to infinity). loge has a large number of nice mathematical properties, so it's called the "natural" logarithm, and the "e" is usually dropped, so it is just written as log. But - don't forget - the base doesn't really matter; it's the logarithm aspect that does.
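As a quick sanity check of the above (using the built-in R functions the post mentions at the end), the base really does just rescale things:

    log10(100)       # 2
    log2(10)         # ~3.32
    log2(100)        # ~6.64, i.e. exactly twice log2(10)
    log10(1/100)     # -2: numbers between 0 and 1 have negative logs
    log(100) / log(10)                            # natural logs rescaled to base 10: 2
    all.equal(log2(100), log10(100) / log10(2))   # TRUE: changing base is just a rescaling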


So - why use them? Here are 5 good reasons to use logs in biology (and many other places!):


  1. Using the logarithm of a scale makes multiplications of that scale just change an offset - this is because log(xy) = log(x) + log(y), so multiplying something by 2 or 10 just adds a constant number. There are quite a few experimental things where properties stay constant under multiplication - doubling the experiment might double the variation in absolute terms, but it will just give an offset in log space. As technology development is often a multiplier on the previous year's technology development, log space is a much better way to look at technology development, whether it is disk space, sequencing output or bandwidth. (As an aside, this is why many financial charts are better read in log(price) rather than price; if you are interested in volatility, for example, you are interested not in its absolute level but rather its multiplicative level.) The short R sketch after this list demonstrates this offset property, along with points 4 and 5.
  2. log(scale) is a very pragmatic way to compress a large range of numbers, even if you are not sure of the reason why you've got such a large range. This is very common in biology, where for example one or two genes might be pumping out RNA at an extreme level, whereas all the other genes are just doing their normal thing. Plot this on a linear scale and everything has to squeeze in around these extreme cases. A log scale is often a far nicer way to see everything simultaneously, even if there is not a good "reason" for using a log scale. Which leads nicely onto the next case...
  3. There are a whole bunch of processes that involve an exponential decay or exponential increase (due to something, for example, happening with a constant probability of changing). Here the log() of the readout might well give nice straight lines - for example, frequencies of alleles in the population are often better plotted on a log scale. Here not only does the log scale let you get everything into one plot, but it is also "the right space to work in" - complex-looking behaviours might end up looking like (for example) two straight lines in log space.
  4. log(probability) both (a) stretches the whole probability space out and (b) "correctly" weights the edges of the space over the centre of the space. For example, something going from 80% to 90% to 95% to 97.5% and then 98.75% accuracy is halving its error rate each time. Plotting the log of (1 - accuracy) (ie, the log of the error) shows this trend far better than looking at the raw numbers. Without taking logs it's easy to think that there is not much difference between 99% accurate and 99.99% accurate. In fact, there is a huge difference. When you are dealing with likelihoods (probabilities of something happening under a model), the "raw" likelihood is very hard to have any intuition about - log(likelihood) turns these into large negative numbers (and the only gotcha is that you are looking for the negative number closer to 0 as the "better" one).
  5. Ratios are often better visualised, and sometimes better used, as log(ratio). Raw ratios of two things (x/y) have everything crammed between 0 and 1 when y is greater than x, but when x is greater than y they can seemingly go on forever. Plotting this looks odd. log(x/y), though, is nicely balanced, with the same amount of "space" on either side of 0 (0 being 1:1).
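To make a few of these concrete, here is a small R sketch (the numbers are my own toy examples, not from the post) showing the constant-offset property from point 1, the log-error view of accuracy from point 4, and the symmetric log-ratios from point 5:

    # Point 1: multiplying by a constant just adds an offset in log space
    log10(c(1, 10, 100) * 2) - log10(c(1, 10, 100))   # all equal log10(2), ~0.301

    # Point 4: accuracies that halve the error each step look linear in log(error)
    accuracy <- c(0.80, 0.90, 0.95, 0.975, 0.9875)
    log10(1 - accuracy)   # drops by ~0.301 (a factor of 2 in error) each step

    # Point 5: raw ratios are cramped below 1 and unbounded above 1;
    # log ratios are symmetric around 0
    x <- c(1, 2, 4, 8); y <- c(8, 4, 2, 1)
    x / y        # 0.125, 0.5, 2, 8  - asymmetric
    log2(x / y)  # -3, -1, 1, 3      - balanced around 0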

So - here are my 5 reasons. Logarithms are so pervasive that there are many other reasons to love them. I was in the generation that was allowed minimally functional calculators rather than slide rules and logarithm tables for more complex arithmetic, and I know some people found the very practical process of using slide rules helpful in gaining an intuition about logs. If you still find logarithms a bit mysterious, I encourage you to play around a bit with some artificial cases (eg, make up some ratios and see how log(ratio) behaves). R of course has three built-in functions, log (natural logs), log10 and log2, all of which I use regularly.




Read the whole story
dnusinow
3902 days ago
reply
Boston, MA
Share this story
Delete

How the NSA Could Stop Sucking and Be Awesome Instead

2 Comments and 3 Shares

I applied for a job at the National Security Agency in 2006.

I was about to finish my undergrad degree in Mathematics. My favorite classes were in abstract algebra, a subject whose only footing in reality is cryptography. I also felt some amount of national pride: my high school was a boarding school in Connecticut, so I experienced September 11, 2001, with many friends from New York City.

There’s an inscription in the marble floor of the building at my alma mater that houses the Mathematics department: "Reality favors symmetries and slight anachronisms". Only now do I appreciate that quote, since I received my full-time offer from Google while I was in the interview process with NSA.

I e-mailed my NSA recruiter and said thanks but no thanks.

I’ve still got that memory of what it feels like to have nationalistic pride in an organization that’s on the forefront of mathematics, computer science, and engineering. Sadly, that power has now been turned inward, but I think it’s possible to fix NSA’s image, and use it to make America a better place.

It should go without saying that NSA needs real oversight, and needs to stop spying on Americans. After that, though, I think there are some concrete things that NSA could do to redeem itself, and maybe even attract talent.

Open Source Code

For 99.9% of developers, cryptography is very easy to get wrong. Even in well-respected open source packages there are obscure issues, like the OpenSSL pseudo-random number generator bug that broke SSH badly. It was caused by a developer removing some seemingly do-nothing code at Valgrind's recommendation.

Recommendation 1: NSA could provide open source reference implementations of cryptographic and other security-sensitive code.

Open source, and thus peer-reviewed, code provided by the largest body of elite mathematicians and cryptographers in the world? Yes, please. One less thing to worry about.

Public Key Signing for American Citizens

NSA seems to have a problem identifying the communications of American citizens. If you think about it from a machine intelligence perspective, that’s pretty hard indeed.

Furthermore, PGP users have a difficult time with key exchanges. How do I know the public key you sent me is really your public key? Ideally, it’s been signed by somebody I trust.

This is a place to kill two birds with one bureaucratic stone.

Recommendation 2: NSA could provide an optional service to sign PGP keys as belonging to American citizens.

I already have federally-issued documentation of my citizenship, my US Passport. There ought to be a way to get my PGP key signed by the government, so I can sign my messages as an American citizen, having the government be the trusted authority on that matter.

This is interesting because it doesn’t compromise my privacy. My private key is still private, but the government, through a verification process similar to the passport process, has declared they trust me to be an American citizen.

This could be added as a signal for NSA collection systems: since the NSA ought to trust its own key-signing authority, it can be absolutely sure that an encrypted communication it intercepts is from an American citizen, and thus discard it.

This is a less-terrible-more-useful version of a National ID card, since it doesn’t expose my secrets, but allows me to assert my identity to other parties. Nothing would force me to use NSA’s key-signing service, just as nothing forces me to get a passport or a Facebook account.

Security Training for American Developers

Like I said, cryptography is very hard to get right. Not just algorithms, but protocols as well. What if NSA could help us Americans get it right?

Recommendation 3: NSA could provide a training program for American software and IT professionals on security best practices. For bonus points, the cost of this program could be tax-deductible.

American developers have a security responsibility to keep our trade secrets within our borders, and NSA can help us with that.  It’s not reasonable to allow NSA to patrol our electronic borders itself, but it could help on-the-ground implementers do it right.


I think it’s possible, with the right amount of congressional and judicial oversight, for the NSA to genuinely make America a better place.

We rely on government services for things that ought to be essential to a productive society, like a court system, a military, and infrastructure. Security is becoming one of those fundamental things, as we rely more and more on computers.

China is hacking us. Russia probably is, too. The NSA could be a point of pride and utility for us Americans to keep our economy strong, and safe from foreign invasion.

Until then, though, I’m done using Google products, e-mail, and unencrypted text messaging.

Read the whole story
dnusinow
3902 days ago
reply
Some really interesting ideas here. Public keysigning as a government service is particularly novel to me.
Boston, MA
Share this story
Delete
1 public comment
fabuloso
3892 days ago
reply
not bad
Miami Beach, FL