Category Archives: Communication

Squeezing Rocks with your Bare Hands

Our lab demo group. Photo: Chris Marone

As frequent readers of the blog or listeners of the podcast will know, I really like doing outreach activities. It's one thing to do meaningful science, but another entirely to be able to share that science with the people that paid for it (taxpayers generally) and show them why what we do matters. Outreach is also a great way to get young people interested in STEAM (Science, Technology, Engineering, Art, Math). When anyone you are talking to, adult or child, gets a concept that they never understood before, the lightbulb going on is obvious and very rewarding.

Our lab group recently participated in two outreach events. I've written before about the demonstrations we commonly use, when talking about a local science fair. There are a few that probably deserve their own videos or posts, but I wanted to share one in particular that I improved upon greatly this year: Squeezing Rocks.

A while back I shared a video that explained how rocks are like springs. The normal demonstration we used was a granite block with strain gauges on it and a strip chart recorder... yes... with paper and pen. I thought showing lab visitors such an old piece of technology was a bit ironic after they had just heard about our lab being one of the most advanced in the world. Indeed, when I started the paper feed, a few parents would chuckle at recognizing the equipment from decades ago. For the video I made an on-screen chart recorder with an Arduino. That was an improvement, but I felt there had to be a better way yet. Young children don't really understand graphs or time series yet. Other than making the line wiggle, they didn't really get the idea that it represented the rock deforming as they stepped on it or squeezed it.

I decided to go semi old-school with a giant analog meter to show how much the rock was deformed. I wanted to avoid a lot of analog electronics, as they always get finicky to set up, so I elected to go the solution-on-a-chip route with a micro-controller and the HX711 load cell amplifier/digitizer. For the giant meter, I didn't think building an actual meter movement was very practical, but a servo and plexiglass setup should work.
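The heart of a meter like this is a small mapping: take a raw reading from the amplifier, subtract the no-load (tare) value, and scale the result to a needle angle. Below is a minimal Python sketch of that idea, not the project's actual firmware. The function name, the count values, and the 0 to 180 degree sweep are all assumptions made for illustration.

```python
def counts_to_angle(raw, tare, full_scale, max_angle=180.0):
    """Map a raw amplifier reading to a needle angle in degrees.

    raw        -- current reading in ADC counts (hypothetical values here)
    tare       -- reading with nobody squeezing the rock
    full_scale -- counts above tare that should peg the needle
    max_angle  -- assumed servo sweep; a real servo may differ
    """
    if full_scale <= 0:
        raise ValueError("full_scale must be positive")
    fraction = (raw - tare) / full_scale
    # Clamp so noise below tare or an over-enthusiastic squeeze
    # cannot drive the servo past its end stops.
    fraction = min(max(fraction, 0.0), 1.0)
    return fraction * max_angle


# A reading halfway between tare and full scale puts the needle mid-sweep.
print(counts_to_angle(raw=8500, tare=8000, full_scale=1000))
```

Clamping matters more than it looks: without it, a reading slightly below tare would ask the servo for a negative angle.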

A very early test of the meter shows its 3D printed servo holder inside and the electronics trailing behind.

Another thing I wanted to change was the rock we use for the demo. The large granite bar you stepped on was bulky and hard to transport. I also thought squeezing with your hands would add to the effect. We had a small cube of granite about 2" on a side cut with a water jet, then ground smooth. The machine shop milled out a 1/4" deep recess where I could epoxy the strain gauges.

Placing strain gauges under a magnifier with tweezers and epoxy.

Step-by-step build instructions are something I'm working on over at the project's Hack-a-Day page. I'm also getting the code and drawings together in a GitHub repository (slowly since it is job application time). Currently the instructions are lacking somewhat, but stay tuned. Check out the video of the final product working below:

The demo was a great success. We debuted it at the AGU Exploration Station event. Penn State even wrote up a nice little article about our group. Parents and kids were amazed that they could deform the rock, and even more amazed when I told them that full scale on the meter was about 0.5µm of deformation. In other words they had compressed the rock about 1/40 the width of a single human hair.
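To put those numbers in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes the 0.5 µm of shortening is spread across the full 2" side of the cube, which the post does not actually specify, and it backs out the hair width implied by the 1/40 comparison rather than using a measured value.

```python
INCH_TO_M = 0.0254

cube_side_m = 2 * INCH_TO_M   # cube is "about 2 inches on a side"
deformation_m = 0.5e-6        # full scale on the meter, about 0.5 micrometers
hair_ratio = 40               # "about 1/40 the width of a single human hair"

# Strain is the change in length divided by the original length
# (assumption: the deformation acts across the whole 2" side).
strain = deformation_m / cube_side_m

# Hair width implied by the 1/40 comparison, in micrometers.
implied_hair_width_um = deformation_m * hair_ratio * 1e6

print(f"strain is about {strain:.1e}")
print(f"implied hair width is about {implied_hair_width_um:.0f} micrometers")
```

Under that assumption the full-scale strain works out to roughly ten parts per million, which is why the signal needs a dedicated amplifier in the first place.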

A few lessons came out of this. Shipping an acrylic box is a bad idea. The meter was cracked on the side in return shipping. The damage is reparable, but I'm going to build a smaller (~12-18") unit with a wood frame and back and acrylic for the front panel. I also had a problem with parts breaking off the PCB in shipment. I wanted the electronics exposed for people to see, but maybe a clear case is best instead of open. I may try open one more time with a better case on it for transport. The final lesson was just how hard on equipment young kids can be. We had some enthusiastic rock squeezers, and by the end of the day the insulation on the wires to the rock was starting to crack. I'm still not sure what the best way to deal with this is, but I'm going to try a jacketed cable for starters.

Keep an eye on the project page for updates and if any big changes are made, you'll see them here on the blog as well. I'm still thinking of ways to improve this demo and a few others, but this was a giant step forward. Kids seeing a big "Rock Squeeze O Meter" was a real attention getter.

Hmm... As I'm writing this I'm thinking about a giant LED bar graph. It's easy to transport and kind of like those test your strength games at the fair... I think I better go parts shopping.

Open Science Pt. 2 - Open Data

For the second installment in our summer open science series, I’d like to talk about open data. This could very well be one of the more debated topics; I certainly know it always gets my colleagues’ opinions exposed very quickly in one direction or the other. I’d like to think about why we would do this, methods and challenges of open data, and close with my personal viewpoint on the topic.

What is Open Data?

Open data simply means putting the data that supports your scientific arguments in a publicly available location so that anyone can download it, replicate your analysis, or try new kinds of analysis. This is now easier than ever for us to do with a vast array of services that offer hosting of data, code, etc. The fact that every researcher will likely have a fast internet connection makes most arguments about file size invalid, with the exception of very large (hundreds of gigabytes) files. The quote below is a good starting place for our discussion:

Numerous scientists have pointed out the irony that right at the historical moment when we have the technologies to permit worldwide availability and distributed process of scientific data, broadening collaboration and accelerating the pace and depth of discovery…..we are busy locking up that data and preventing the use of correspondingly advanced technologies on knowledge.

- John Wilbanks, VP Science, Creative Commons

Why/Why-Not Open Data?

When I say that I put my data “out-there” for mass consumption, I often get strange looks from others in the field. Sometimes it is due to not being familiar with the concept, but other times it comes with the line “are you crazy?”  Let’s take a look at why and why-not to set data free.

First, let’s state the facts about why open data is good. I don’t think there is much argument on these points; then we’ll go on to address more two-sided facets of the idea. It is clear that open data has the potential to increase the friendliness of a group of knowledge workers and our potential for collaboration. Sharing our data enables us to pull from data that has been collected by others, and gain new insights from others’ analysis and comments on our data. This can reduce the duplication of work and hopefully increase the number of checks done on a particular analysis. It also gives our supporters (taxpayers for most of us) the best “bang for their buck.” The more places the same data is used, the lower the cost per bit of knowledge extracted from it. Finally, open data prevents researchers from taking their body of knowledge “to the grave” either literally or metaphorically. Too often a grad student leaves a lab group to go on in their career and all of their data, notes, results, etc. that are not published go with them. Later students have to reproduce some of the work for comparison using scant clues in papers, or email the original student and ask for the data. After some rummaging, they are likely emailed a few scattered, poorly formatted spreadsheets with some random sampling of the data that is worse than no data at all. Open data means that quality data is posted and available for anyone, including future students and future versions of yourself!

Like every coin, there is another side to open data. This side is full of “challenges.” Some of these challenges even pass the polite term and are really just full-blown problems. The biggest criticism is wondering why someone would put the data that they worked very hard to collect out in the open, for free, to be utilized by anyone and for any purpose. Maybe you plan on mining the data more yourself and are afraid that someone else will do that first. Maybe the data is very costly to collect and there is great competition to have the “best set” of data. Whatever the motivation, this complaint is not going to go away. Generally my reply to these criticisms goes along the lines of data citation. Data is becoming a commodity in any field (marketing, biology, music, geology, etc.). The best way to be sure that your data is properly credited is to make it open with citation. This means that people will use your data, because they can find it, but provide proper credit. There are a number of ways to get your data assigned a digital object identifier (DOI), including services like DataCite. If anything, this protects the data collector by providing a time-stamped record that they collected data on phenomenon X at a certain time. I’m also very hopeful that future tenure committees will begin to recognize data as a useful output, not just papers. I’ve seen too many papers that were published as a “data dump.” I believe that we are technologically past that now. If we can get past "publish or perish," we can stop these publications and just let the data speak for itself.

Another common statement is “my data is too complicated/specialized to be used by anyone else, and I don’t want it getting mis-used.” I understand the sentiment behind this statement, but often hear it as “I don’t want to dedicate time to cleaning up my data, I’ll never look at it after I publish this paper anyway.” Taking the time to clean up data for it to be made publicly available is when you have a second chance to find problems, make notes about procedures and observations, and make it clear exactly what happened during your experiment (physical or computational). I cannot even count the number of times I’ve looked back at old data and found notes to myself in the comments that helped guide me through re-analysis. These notes saved hours of time and possibly a few mistakes along the way.

Data Licensing

Like everything from software to intellectual property, open data requires a license to work. No license on data is almost worse than no data at all because the hands of whoever finds it are legally bound to do nothing with it. There is even a PLOS article about licensing scientific software that is a good read and largely applies to data.

The data licensing options available to you are largely a function of the country you work in, and you should talk with your funding agency. The country or funding agency may limit the options you have. For example, data from US publicly funded research must be made available under a presidential mandate that data be open where possible “as a public good to advance government efficiency, improve accountability, and fuel private sector innovation, scientific discovery, and economic growth.” You can read all about it in the White House U.S. Open Data Action Plan. So, depending on your funding source you may be violating policy by hoarding your data.

There is one exception to this: Some data are export controlled, meaning that the government restricts what can be put out in the open for national security purposes. Generally this pertains to projects that have applications in areas such as nuclear weapons, missile guidance, sonar, and other defense department topics. Even in these cases, certain parts of the data can often still be released (and should be), but it is possible that some bits of data or code may be confidential. Releasing confidential material is a good way to end up in trouble with your government, so be sure to check. This generally applies to nuclear and mechanical engineering projects and some astrophysical projects.

File Formats

A large challenge to open data is the file formats we use to store our data. Oftentimes the scientist will use an instrument to collect their data that stores information in a manufacturer-specific, proprietary format. It is analyzed with proprietary software and a screenshot of the results is included in the publication. Posting that raw data from the instrument does no good since others must have the licensed and closed-source software to even open it. In many cases, the users pay many thousands of dollars a year for a software “seat” that allows them to use the software. If they stop paying, the software stops working… they never really own it. This is a technique that the instrument companies use to ensure continued revenue. I understand the strategy from a business perspective and understand that development is expensive, but this is the wrong business model for a research company, especially considering that the software is generally difficult to use and poorly written.

Why do we still deal in proprietary formats? Often it is because that is what the software we use produces, as mentioned above. Other times it is because legacy formats die hard. Research groups that have a large base of data in an outdated format are hesitant to update the format because it involves a lot of data maintenance. That kind of work is slow, boring, and unfunded. It’s no wonder nobody wants to do it! This is partially the fault of the funding structure, but unmaintained data is useless data and does not fulfill the “open” idea. I’m not sure what the best way to change this idea in the community is, but it must change. Recent competitions to “rescue” data from older publications are a promising start. Another, darker, reason is that some researchers want to make their data obscure. Sure, it is posted online, so they claim it is “open”, but the format is poorly explained or there is no meta-data. This is rare, but it can be found in competitive fields. This is data hoarding in the ugliest form under the guise of being open.

There are several open formats that are available for almost any kind of data, including plain text, markdown, netCDF, HDF5, and TDMS. I was at a meeting a few years ago where someone argued that all data should be archived as Excel files because “you’ll always be able to open those.” My jaw dropped. Excel is a closed, XML-based format that you must have a closed-source program to open. Yes, OpenOffice can open those files, but compatibility can be sketchy. Stick to a format that can handle large files (unlike Excel), supports complex multi-dimensional data (unlike Excel), and has many tools in many languages to read/write it (unlike Excel).
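As a small illustration of why plain text is such a safe choice, here is a minimal Python sketch that writes a tiny time series to CSV and reads it back using only the standard library. The column names and values are made up for the example, and an in-memory buffer stands in for a real file; larger or multi-dimensional data would be better served by netCDF or HDF5.

```python
import csv
import io

# Made-up example rows; the column names carry their own units.
rows = [
    {"time_s": 0.0, "strain_microstrain": 0.0},
    {"time_s": 0.5, "strain_microstrain": 4.9},
    {"time_s": 1.0, "strain_microstrain": 9.8},
]


def write_plain_text(rows, handle):
    """Write rows as CSV with a header line naming every column."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


def read_plain_text(handle):
    """Read the CSV back, converting every field to a float."""
    return [{key: float(value) for key, value in row.items()}
            for row in csv.DictReader(handle)]


buffer = io.StringIO()        # stands in for a file opened on disk
write_plain_text(rows, buffer)
buffer.seek(0)
recovered = read_plain_text(buffer)

print(buffer.getvalue())
print(recovered == rows)
```

Nothing here needs a license or a particular vendor's program, and a reader in almost any language can parse the result.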

The final format/data maintenance task is a physical format concern. Storage media changes with time. We have transitioned from tapes, floppy disks, CDs, and ZIP disks to solid state storage and large external hard-drives. I’m sure some folks have their data on large floppy disks, but haven’t had a computer to read them in years. That data is lost as well. Keeping formats updated is another thankless and unfunded task. Even modern hard-drives must be backed up and replaced after a finite shelf life to ensure data continuity. Until the funding agencies realize this, the best we can do is write in a small budget line-item to update our storage and maintain a safe and useful archive of our data.

Meta-Data

The last item I want to talk about in this already long article is meta-data. Meta-data, as the name implies, are data about the data. Without the meta-data, most data are useless. Data must be accompanied by the experimental description, relevant parameters (who, when, where, why, how, etc), and information about what each data item means. Often this data lives in the pages of the laboratory notebooks of experimenters or on scraps of paper or whiteboards for modelers. Scanners with optical character recognition (OCR) can help solve that problem in many cases.

The other problems with meta-data are human problems. We think we’ll remember something, or we don’t have time to collect it. Anytime that I’ve thought I didn’t have time to write good notes, I paid by spending much more time after the fact figuring out what happened. Collecting meta-data is something we can’t ever do enough of and need to train ourselves to do. Again, it is a thankless and unfunded job… but just do it. I’ve even just turned on a video or audio recorder before and dictated what I’m doing. If you are running a complex analysis procedure, flip on a screen capture program and make a video of doing it to explain it to your future self and anyone else who is interested.

Meta-data is also a tricky beast because we never know what to record. Generally, record everything you think is relevant, then record everything else. In rock mechanics we generally record stress conditions, but never think to write down things like temperature and humidity in the lab. Well, we never think to until someone proves that humidity makes a difference in the results. Now all of our old data could be mined to verify/refute that hypothesis, except we don’t have the meta-data of humidity. While recording everything is impossible, it is wise to record everything that you can within a reasonable budget and time commitment. Consistency is key. Recording all of the parameters every time is necessary to be useful!
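One lightweight way to enforce that consistency is to keep the meta-data in a small machine-readable record next to the data and check it against a fixed list of fields every time. The Python sketch below is only an illustration: the field names, the example values, and the idea of a JSON "sidecar" record are assumptions, not a description of any lab's actual system.

```python
import json

# Hypothetical checklist: the who/when/where/why/how from the text,
# plus the lab conditions that are easy to forget.
REQUIRED_FIELDS = (
    "who", "when", "where", "why", "how",
    "temperature_c", "humidity_percent",
)


def missing_fields(metadata, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or left empty."""
    return [name for name in required
            if metadata.get(name) in (None, "")]


# Made-up record for one experiment; humidity was not written down.
record = {
    "who": "example experimenter",
    "when": "2015-01-01T09:00:00",
    "where": "example lab",
    "why": "outreach demo calibration",
    "how": "hand squeeze on granite cube",
    "temperature_c": 21.5,
}

print(missing_fields(record))
print(json.dumps(record, indent=2))
```

Running the check before an experiment is archived turns "we never think to write it down" into a visible gap while the information is still recoverable.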

Final Thoughts

Whew! That is a lot of content. I think each item has a lot unsaid still, but this is where my thinking currently sits on the topic. I think my view is rather clear, but I want to know how we can make it better. How can we share in fair and useful ways? Everyone is imperfect at this, but that shouldn’t stop us from striving for the best results we can achieve! Next time we’ll briefly mention an unplanned segment on open-notebooks, then get on to open-source software. Until then, keep collecting, documenting, and sharing. Please comment with your thoughts/opinions!

Knowing the Fundamentals

phdcomics.com

Ok, I've been sitting on this topic for a while, but I was recently inspired to revive this post after being asked some very general questions by a tour group that came through the lab. Next time we'll be back to doing some data collection and analysis. Maybe gravity tide measurements? Anyhow, on with the topic of the day: knowing the fundamentals.

On a (now old) episode of the podcast Technical Difficulties, Gabe Weatherhead (@macdrifter) was chatting with Brett Terpstra (@ttscoff) and Rob Trew (@complexpoint). The show was mostly about how everyone got started writing computer code and some tool suggestions. One line made me stop on my walk home to type it into my reminders to write this post.

Rob quoted a line from the Windows 95 API Manual (a programming interface manual for the non-programmers out there). It said "The nature of an expert is not someone who knows all the details, it's someone who understands the fundamentals really well." Rob points out that therein lies the key to problem solving. This statement really resonated with me when I looked back on problems that I've encountered in the past, both scientific and technological.

We often think of an expert as someone who is in the top few percent of the knowledge leaders in their field. Experts should know all of the details of their subject, including the latest "bleeding edge" research, right? While many experts do stay up to date, I began re-examining the people that I considered to be experts.

The professor may be the ideal example of this. While academics often get the connotation of the aloof and socially insulated genius, it's really not true. (In fact, our academic heroes are just people too; listen to the latest Nerds on Draft for that side story.) Professors have to teach the same material over and over again during their career. Sure, they should be pushing the frontiers of their field within their research group, but that's not what should be done in the education portion of the career. When you teach something, you end up deeply learning it yourself. In fact, that is part of the value in teaching! By continually reiterating the fundamentals to ourselves, we can stay primed to approach a new problem with a honed set of tools.

What could these fundamentals be? Well, that depends on your work. Maybe it is knowing the basics of programming or how to do basic chemical balance/thermodynamics calculations. Maybe it is knowing the fundamental operation of the product that you sell, or knowing the backstory to a concept you are helping someone with (such as the history of a topic).

I can't count the number of times that I've been trying to figure out a solution to a problem or how to build something when, after hours of no progress, something will make me start again. This time I look from a fundamentals viewpoint and can generally see a way to a solution or at least enough of the way to be able to ask an intelligent question.

Ideally, we are prepared for this way of problem solving by getting the basics of many fields during our undergraduate careers. Unfortunately that doesn't always happen. We have all sat in a math class, economics class, etc. when the professor goes deep into a subject that they adore and leaves us in the dust. Another common occurrence is that the application of the fundamentals is not shown or sometimes not even implied. Not that students should be guided by the hand to the solution, but sometimes a firm nudge is necessary. I didn't necessarily appreciate this early in my undergraduate career, but later became a mass consumer of basic knowledge.

Next time you are on Amazon or in the library, browse over to a section with a topic of interest and pick up an introductory book. Read some sections, try some problems, and you'll be amazed at the other angles you can suddenly see as avenues of attack to a problem. You can even pick up some of your old textbooks and remind yourself of the fundamentals that all too often slip from our minds with time.

Fun Paper Fridays

Image: phdcomics.com

In my last post about why I think the expert generalist is crucial in today's highly inter-related world, I mentioned a practice that I've adopted of "Fun Paper Fridays." Today I want to briefly describe Fun Paper Fridays and invite you to participate.

The Routine
Every Friday I go to a coffee shop first thing in the morning and commence my weekly review. During this time I check the status of projects, emails, etc. and make sure that things are not slipping through the cracks. Those of you familiar with David Allen's Getting Things Done will recognize this. In addition to reviewing my schedule, I added a self-expansion project.

Each week I pick out a paper that isn't directly related to my research and read it.  The paper can be serious, just not about my work (ex: Viking Lander 1 and 2 revisited: The characterization and detection of Martian dust devils), or it can be a completely fun topic (ex: How to construct the perfect sandcastle).  That's it! Just read a paper, no notes unless you want.  You'll be surprised when in some situation you'll recall a fact, method, or comment from one of these papers and be able to apply it to a completely different scenario.

Join Me
I hope that you'll join me in this quest of broadening your knowledge horizons. If you're not involved with science, that's no problem. Just read something that you normally wouldn't. Maybe it's the Art & Culture section of a newspaper or an article from a popular science magazine. Every Friday I'll be posting the paper I'm reading on Facebook and Twitter. Please join me and use the tag: #FunPaperFriday.

NSF Graduate Fellowships - Some Thoughts and Tips

While this post may not appeal to the general audience, I thought it would be useful because it is an important topic to any senior undergraduate or first/second year graduate student. Today I want to briefly tell you my experience applying for the NSF Graduate Fellowship in 2012 and 2013.  I learned a lot in the process of applying for this prestigious fellowship and hope that I can pass some of that knowledge down!

Application 2012 - No award

My first year at Penn State, I applied with the traditional three documents of research statement, personal statement, and research proposal.  I sought the edits of those who had been awarded the fellowship in the past and thought I had a convincing packet assembled.  After reading, re-reading, and re-reading, it was time to submit.  I submitted the application, then made the mistake of reading over it again a week later and finding things I wished I had changed.  Months went by and seemed to drag on until the award announcements came.  I was not selected for an award.  While I was of course disappointed, it was time to kick it into high gear and make an even better application for my next (and final) try.

Application 2013 - Award Offered

For my second application I had lots of debates with myself.  Should I change my research proposal topic? Were my personal and research statements too similar? How can I improve the writing? Should I include figures?

To settle these debates, I turned to the wealth of online information that I hadn't sought out the previous year.  I talked with those who had received the award, I read funded research proposals from various professors and researchers, and I went down to the bare bones of the document.  While I'll discuss specific tips below, I'll just say that I started earlier, took more pauses between writing sprints, and sought more people for reading.

My tips

In writing two proposals, I learned a lot about how to effectively structure my research and emphasize the specific angle of attack I'll take on a research question and why it's different.  Here are some things I found to be helpful:

MY PROCESS

  • Start Early - Think it's too soon? Wrong! You need lots of time to organize your thoughts, revise, rewrite, and think about your application.
  • Read the Announcement - Print out the announcement document and read it critically.  You can look at the 2014 announcement here.  Don't just read it, mark on it. Highlight what they specifically are looking for, underline the buzzwords and key phrases of the call.  Also, draw a big box around the application deadline and then plan to beat it by one week.  Why? Computer problems, server crashes, unexpected medical emergency, etc.  You don't know what could happen, so make sure that your application gets in early!
  • Make a FastLane Account - Go to the online application and make an account. Get familiar with FastLane; you'll use it for almost all of your NSF proposals unless they change sometime in the future! Look at the application. Go ahead and fill in the boxes with your name, address, etc. Now you can mark down progress on your application and have momentum to move forward with the hard work.
  • Write the Requirements for your statements out on Paper - This one is huge.   In the application, pull up the research proposal and background/personal statement "prompts."  Print them, read them many times, and finally write them down on a notepad.  Break the prompt up into small chunks and then think about how to answer each piece.  Don't worry about flow, just think.
  • Brain Dump - Now write each one of the pieces of the question on the top of a page and begin to outline the points that you will make to address it.  Again, don't worry about order or how many points you have! Just write and write and write.
  • Organize into an Outline - Take a break, a day or so, then come back to your brain dump afresh and think about how you can piece it together into one coherent story - your story.  The story of a proposed research project and the story of you and your life in science.
  • Make a Draft - It does not need to be pretty, organized, the right length, etc.  Just get complete sentences onto the page.  Do this on paper or in a plain text editor.  Don't worry about formatting, length, spacing, margins... Those are things for later in a word processor.  I like using Textastic, Sublime, TextWrangler, or Editorial.
  • Read it and Have Lots of People Read it - Don't be afraid to ask everyone to read and edit your document.  Do not ask them to re-write the document for you! Remember this needs to come from your brain, but it is fine to gather suggestions and comments.  I also went to the graduate writing center and had some great suggestions from the coach there.  As scientists, we are not used to marketing ourselves and we often think the need for our research is obvious.... That won't work.
  • Talk to Your Reference Writers - You'll need letters of reference.  These take lots of time to write, so make things easy on your mentors and writers.  They have done a lot for you and are about to help out again.  I went through the application, figured out what I thought would be important to my application reviewers and then composed an email to my writers (see below).
  • Do NOT Cram - Whitespace is a dear friend to someone who is reading many pages of documents... like your judges.  Don't pack every single word you possibly can into the pages.  Economy of words shows great thought and restraint when writing.  Edit down over and over.  Leave white space between paragraphs.  If you use figures, text wrapping is a fine way to reclaim space, but leave a sufficient margin.  Look at books and other professionally formatted documents for inspiration.

MY APPLICATIONS FROM BOTH YEARS

Here are links to documents I produced for both applications. Of course don't plagiarize, but hopefully they are helpful!

NSF 2012 (No Award)
Personal Statement | Previous Research | Proposed Research

NSF 2013 (Award Offered)
Research Statement | Personal & Background Statement | Letter to Reference Writers

LINKS

There are several other helpful webpages out there about the application process. Remember, what you read on the program site is the final word, but these pages have more useful tips.

GRFP Essay Insights (Missouri)
Alex Lang's Website
Jennifer Wang's Website
Reid Berdanier's Website
The Official NSF GRFP Page
NSF PAPP Guide Book

Remember, if you don't get the award, take the feedback you get and start improving! Try, try again and don't be afraid to seek help from mentors, writers, friends, and family.  Please leave any useful comments below. Best of luck!