Sunday 9 February 2014

Libraries in their Simplest Terms

Ursula must start this post by apologising for yet another long silence. The reason for it is not winter torpor - had it been, she would have taken one look at the current weather and crawled straight back to sleep - but that she has been very busy co-editing a book. (Just in case anyone is interested, it is titled "Computational Biomedicine" and it will be published this summer by OUP.) She is currently awaiting the proofs, and therefore has time to take an interest again in all things library and information.

And then Aidan Baker asked her to bake a cake as a prize in a competition for Cambridge librarians that he was organising. This was to write an account of the recent Libraries at Cambridge conference (Twitter hashtag #lac14) using only the thousand most commonly used words in the English language. Doing this is best described as "simple but not easy"; you can tell straight away which words are allowed by typing your text in the Up-Goer Five text editor, but the vocabulary is extremely limited. The word "thousand" isn't in the list, for instance (you have to use "ten hundred"). And, more to the point, nor is the word "library". You can see the full list (in alphabetical order) here, and read a piece of "Up-Goer Five speak" by Aidan.
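For anyone curious how a checker of this kind might work, here is a minimal sketch in Python. It is not the Up-Goer Five editor's own code: the function name and the tiny sample word list are made up for illustration, and a real list would hold the full ten hundred words. It also treats every spelling as a separate word, so plurals and verb endings would need to be in the list in their own right.

```python
import re


def disallowed_words(text, allowed):
    """Return the words in `text` that are not in `allowed`.

    Words are lower-cased and reported once each, in order of first
    appearance. `allowed` is any set of permitted lower-case words.
    """
    seen = set()
    rejected = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in allowed and word not in seen:
            seen.add(word)
            rejected.append(word)
    return rejected


# Hypothetical stand-in for the real list, just large enough for the example.
SAMPLE_ALLOWED = {"the", "a", "has", "ten", "hundred", "books"}

print(disallowed_words("The library has a thousand books", SAMPLE_ALLOWED))
# prints ['library', 'thousand']
```

Swapping "ten hundred" in for "thousand" gets that word past the check; "library" still needs a paraphrase.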

You can get a reasonable idea of whether Up-Goer Five is likely to allow a word by looking it up on WordCount, a surprisingly addictive little web tool that lists the 86,000 commonest words in English in order: from "the" at no. 1 to "fireballs" at no. 86,000. There is a good but not perfect correlation between Up-Goer Five's vocabulary and WordCount's top thousand; "library" is well outside at no. 1252, but "thousand" is no. 989. WordCount, however, is based on the British National Corpus, which was compiled in 1994, and thus reflects English usage about twenty years ago. "Computer" is well up at 705 (and also in the Up-Goer Five list) but "laptop" is only at 35,149 and "smartphone" doesn't appear at all. It would be interesting to know where those words would appear if the list were rewritten today.

The competition was judged by a simple vote at the Cambridge librarians' brown bag lunch on Wednesday February 5, and Ursula was there to join in the discussion and present the cake. She also brought chocolate tiffin (interestingly, neither "cake" nor "chocolate" appears in Up-Goer Five's list). There had been a lot of interest in the competition, but not many actual entries. Several of those at the meeting had had a go but not got far enough to enter. One confessed that he had been stymied from the start by not being able to put the word "conference" (WordCount's #1015) in.

Discussion turned on comparison between Up-Goer Five and other basic English vocabularies. As long ago as the 1920s, the linguist and philosopher Charles Kay Ogden invented "Basic English", a core vocabulary of 850 words with a simplified grammar for teaching English as a second language. Not surprisingly, Basic English does not include the word "computer", although Ogden would very likely have come across the word, which was first used to refer to a machine in the 1860s. But despite its age, and despite its even smaller size, the Basic English vocabulary seems to be more practically usable than Up-Goer Five's, probably because it deals with the words that are most useful rather than those that are most used.

In the end, those at the meeting awarded second prize - and Ursula's cake - to Diana Wood, for a summary of the whole conference. Her entry, entitled "Libraries@Cambridge 2014 - Up-Goer Five style!" reads so well that you might almost forget the limitations of the thousand-word vocabulary. Until you come to the "water car", that is. Well, with "ship" (#2140 in WordCount) and "boat" (#1888) ruled out of bounds, what else do you call such a thing? (The entry includes a handy picture of a ship in case anyone is still confused.) The first prize, a voucher from Cambridge Wine Merchants kindly donated by Lyn Bailey, Cambridge's classics librarian, was awarded to graduate trainee Emily Downes. She set herself the extra challenge of writing her entry in haiku form, and it worked extremely well. You can read it here.

Sunday 21 July 2013

"Can I Use This?"

"Can I Use This?" is the title of an interesting opinion piece by Beth Harris and Steven Zucker on the provision of digital images in museums. Their perhaps controversial suggestion was that the majority of museums and libraries who do not provide such materials free of charge are "undermining" education. Although this focused on provision in the specific discipline of art history, curators and librarians in many other disciplines are having to decide how to approach these issues.

This paper provoked a lively discussion at Cambridge University librarians' "brown bag lunch" in June, which I attended, as before bringing chocolate tiffin to cement my status as an honorary librarian for an hour or so. Several of the genuine librarians present have begun providing images as a service for readers, and none has completely solved the dilemma of whether (and if so how) to charge for them. Charging for commercial but not for academic or research use - as Aidan Baker does at the Haddon - is a common approach, but it is not the only one. One participant mentioned an instance of a Cambridge graduate student being charged to use an image generated in and held by a Cambridge University library in her thesis. And even with a policy of charging for commercial use only, it is not always easy to distinguish between the two.

And this - as well as the prospect of a lunch including tiffin - is the question that has woken Ursula from her particularly long winter torpor. I am currently editing a textbook, Computational Biomedicine, which is due to be published next year by Oxford University Press. This book will include a large number of images and other diagrams - some directly provided by our chapter contributors and others taken from published papers and reviews - and each has to be checked for copyright. My colleagues and I are in the same position as a user seeking permission to reproduce digital images from books in an academic library. And how does one classify a textbook, which is to be sold for profit but for educational purposes: as commercial or as academic? (Aidan thought that the Haddon would probably treat such a user as a bona fide academic.)

The Haddon Library web pages host several collections of images taken from its older books. Among these is a series of images made from sketches of Australia made in the mid-nineteenth century by the German humanist and explorer William Blandowski. The most popular of all the images in this archive is no. 41, which includes - in the background - what may be the first artistic depiction of football played by "Australian rules". Aidan charges all but academic users on a sliding scale for high resolution copies of these images, which produces a trickle of funds for the library, but almost every request causes questions like those described above.

The complete Haddon image collection (currently comprising 501 images) is held in the University's DSpace archive, "the institutional repository of the University of Cambridge". Individuals and groups attached to the university may deposit there any digital content that is their own and of a "scholarly or heritage" nature. The archive, therefore, includes an enormous variety of material besides images: the archives are broken down into categories including books, audio and video files, software and maps. All material stored there must be freely accessible, at least in some form.

These few examples illustrate just how confusing the situation can seem for any potential user of images held at Cambridge University. Once you have located and selected a useful image, it can be quite difficult to find out who owns it and how it can be used, and time consuming to bat emails backwards and forwards requesting permissions. Creative Commons provides one solution to this problem. Anyone who creates an image can choose to set out their and others' rights in it using one of a range of "Creative Commons licenses". There are six to choose from, ranging from "CC-BY", which allows any user to do anything they like with the material, for whatever purpose, as long as the author is credited, to the most restrictive, "CC-BY-NC-ND", which forbids commercial use or any change to the original material. Creative Commons licensing is becoming more and more popular, and rightly so.
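The six licences can be laid out as a small lookup table, which makes the "can I use this?" question mechanical, at least in outline. The sketch below is in Python and is a deliberate simplification: the table and the function name are my own, every licence also requires the author to be credited, and the "share alike" condition is recorded but not checked.

```python
# The six Creative Commons licences, reduced to three yes/no conditions.
# Attribution is required by all six, so it is not listed separately.
LICENSES = {
    "CC-BY":       {"commercial": True,  "derivatives": True,  "share_alike": False},
    "CC-BY-SA":    {"commercial": True,  "derivatives": True,  "share_alike": True},
    "CC-BY-ND":    {"commercial": True,  "derivatives": False, "share_alike": False},
    "CC-BY-NC":    {"commercial": False, "derivatives": True,  "share_alike": False},
    "CC-BY-NC-SA": {"commercial": False, "derivatives": True,  "share_alike": True},
    "CC-BY-NC-ND": {"commercial": False, "derivatives": False, "share_alike": False},
}


def permits(license_code, commercial_use, modified):
    """Return True if the licence allows the intended use.

    `commercial_use` says whether the use is commercial; `modified`
    says whether the material will be changed in any way.
    """
    terms = LICENSES[license_code]
    if commercial_use and not terms["commercial"]:
        return False
    if modified and not terms["derivatives"]:
        return False
    return True


print(permits("CC-BY", commercial_use=True, modified=True))         # prints True
print(permits("CC-BY-NC-ND", commercial_use=True, modified=False))  # prints False
```

The awkward part remains deciding what to pass in as `commercial_use`: the table cannot say whether a textbook sold for profit but for educational purposes counts.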

And yet - even if we could somehow get to a digital utopia in which all "scholarly material" (however defined) could be accessed and shared at will, for no matter what (legal) purpose, might there not still be something missing? And might that not be, in one participant's elegant phrase, "the physicality of the book"? Certainly, e-books have not taken over from physical books to the extent that seemed likely just a few years ago. People still like to see and to interact with physical objects. Visits to museums and art galleries are still popular, and paper books are still being sold. (It has to be added that some people have taken the desire to possess books to the most undesirable of extremes...)

And it is still possible to have the best of both worlds. There are still places where physical books can be shared freely. Support your local public library!

Saturday 8 September 2012

The Perils of Terminology (and Lunch)

For the last four or five years, librarians at Cambridge University have met once a month at lunchtime to discuss issues in library research, generally centred around a single paper or report. These meetings are termed "brown bag lunches" - after the generic, or American, term for the receptacles that hold the packed lunches that are consumed at the meetings. And I have become in some sense an honorary member of this group.

Which brings me to libraries and their terminology, and to the difficulty that users of academic libraries can have in understanding it. Last week I found myself at just such a lunch, munching chocolate tiffin and discussing a paper by John Kupersmith at the University of California, Berkeley entitled "Library Terms that Users Understand". Understanding terminology is more of a problem for many users, even academic ones, than some librarians realise. Students, in particular, can fail to realise that "Journal" and "Periodical" are used by different libraries to refer to the same type of document - and how about "Serial"? And to say nothing of "Database"... This was not such a problem in pre-Internet decades when users were more often face to face with librarians. The term "users" now refers just as often to users of library websites as to those seen walking around libraries.

After agreeing that users - and librarians - had difficulty working out what exactly was meant by the term "database", the librarians at the meeting - that is, everyone but me - spent some time discussing how they could make their websites more user friendly. Kupersmith's paper recommended avoiding acronyms and vague terms, and using "mouseovers" - now there's a term that not everyone will understand - to define any terms that are not immediately obvious. We heard of one library that has teamed up with another in Australia to provide a real-time "Ask a Librarian" service almost 24-7, and another that provided interactive maps showing the location of specific resources. Much as I admire Aidan's use of Gliffy to produce a map of his library, that is available only in hard copy.

Putting my scientific hat on, I noted that molecular biologists have problems with terminology too, but in a rather different way. While different libraries use different terms to refer to the same concept, biologists have a historical problem with the naming of genes. In the decades before the torrent of data arising from genome projects, it often happened that the same gene would be discovered at roughly the same time by several independent labs. Each group would give the gene a name, and endeavour to keep using that name to highlight their role in the discovery. Pity the poor students with three or more keywords to use when searching for papers on just one gene. One answer to this has been the Gene Ontology - a structured vocabulary for genetics that links the various terms together, listing synonyms for each one. This is only one example of many ontologies that are now widely used in biology.
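The synonym problem is easy to picture as a lookup table. Here is a minimal sketch in Python, assuming each gene has one agreed name plus a list of the names given by other labs. The gene names are invented placeholders, and neither the data layout nor the function names reflect how the Gene Ontology itself is stored.

```python
def build_synonym_index(records):
    """Map every known name (lower-cased) to its agreed name.

    `records` maps each agreed name to a list of its synonyms.
    """
    index = {}
    for canonical, synonyms in records.items():
        for name in [canonical, *synonyms]:
            index[name.lower()] = canonical
    return index


def expand_query(name, records, index):
    """Return every name worth searching on for the gene called `name`.

    Unknown names are returned unchanged, as a one-item list.
    """
    canonical = index.get(name.lower())
    if canonical is None:
        return [name]
    return [canonical, *records[canonical]]


# Invented placeholder names: one gene "discovered" by three labs.
RECORDS = {"geneA": ["labX-1", "fooB"]}
INDEX = build_synonym_index(RECORDS)

print(expand_query("LABX-1", RECORDS, INDEX))
# prints ['geneA', 'labX-1', 'fooB']
```

A student who knows only one lab's name for a gene gets back all three keywords to search on.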

I asked the librarians at the brown bag lunch whether they had any use for the concept of an ontology. They answered that they did - but they called it by a different name. And I have already forgotten what it is...

Wednesday 1 August 2012

Switzerland in London for the Olympics - and Life Sciences

A small part of London has been transformed into Switzerland for the duration of the London Olympics. Well, to be fair, it is one building: but what a building! Glaziers Hall, the headquarters of the wonderfully-named Worshipful Company of Glaziers, has temporarily become "House of Switzerland UK 2012": the Swiss Government's hospitality centre for the duration of the Games. It is tempting to speculate just how much the Swiss must have paid the glaziers to get out of town for the summer, for their building has a delightful and convenient location, sandwiched between Southwark Cathedral and the river. Even the much hyped Olympic crowds at London Bridge station failed to materialise, at least when I was there on Day 4 of the Games, at mid-day and latish in the evening.

But what has this to do with biology, or journalism, or indeed any of the usual topics of this blog? Showing that biotechnologists are not above taking advantage of the networking opportunities offered by the Olympics, the canton of Zurich organised a very interesting Life Science Day at the House of Switzerland on 30 July. The Day, more precisely a long afternoon, was divided into three sessions. The first, organised by the EU HealthTIES consortium, presented some of the most important scientific and technical advances that are likely to affect healthcare in the next decades.

The HealthTIES consortium links biotech and health organisations in five European regions - Zurich (of course); Oxford and its environs; BioCat in Catalonia, Spain; Medical Delta in the Netherlands; and Észak-Alföld in Hungary - aiming to promote innovation in technology for healthcare through collaboration. The first session brought together experts from Zurich, Oxford and Hungary to discuss some of these innovations and their implications for policy and ethics. It was particularly interesting for me to hear Hagan Bayley from the University of Oxford, whose academic research I had recently reported on for the Institute of Structural and Molecular Biology at Birkbeck and UCL, wearing his entrepreneurial hat to discuss the promise and implications of the $1000 Genome. Highlights from other UK speakers included Lord Brennan QC describing his own heart attack as a preface to a talk on health policy and Peter Walton's presentation of the clinical benefits of the UK's Adult Cardiac Surgery Database.

The second session brought the focus more firmly on to Switzerland, with presentations from five companies located in the Zurich region. Four of these were biotechs: EMPA manufactures biodegradable "plastics" from chemicals synthesised by micro-organisms; INSPHERO develops three-dimensional cell cultures for drug testing; Xeltis is developing methods to grow heart valves and blood vessels from a patient's own cells; and Virometrix is involved in vaccine design. The fifth presentation was by a finance company, SIX Swiss Exchange.

The final session was badged as part of the UK co-organiser One Nucleus' BioWednesday series of informal discussions and networking meetings. One Nucleus is a membership organisation for biotech companies and professionals in the greater London and Cambridge areas. This "BioWednesday on a Monday" took the form of a discussion of the characteristics and benefits of a successful biotech cluster, taking Zurich and London / Cambridge as examples. The combined London / Cambridge cluster is about twice the size of the one centred on Zurich, with 167 biotech companies compared to Zurich's 87. A lively discussion led by panellists from both countries cited features of successful clusters that included quality of life (with one Swiss panellist describing his whole country as "a biotech cluster with recreational add-ons"); communication links; and a mixture of different types of company with universities and teaching hospitals. Not surprisingly, the one negative feature of the locations much cited by delegates from both countries was their high costs.

An excellent "standing dinner" with wine brought this very worthwhile meeting to its close before we all braved the Olympic traffic home.

Sunday 20 May 2012

Disguised as a librarian

It's been a long time...

What has now woken Ursula up from her long "winter torpor" (not true hibernation, it seems) was a presentation in Cambridge by Phil Bradley, President of CILIP (the Chartered Institute of Library and Information Professionals, and thus Aidan Baker's professional body). Phil describes himself as an "information specialist and Internet consultant" and I have been following him on Twitter for a couple of years now. As I have always maintained that writers and lecturers need to be expert users of the Internet (and Web 2.0 and social media) just as much as librarians, I duly disguised myself as a librarian and went along. And it was well worth my while!

The afternoon was divided into two parts: first Phil's presentation with questions and answers, and then two sets of discussion groups. I chose a session on Internet tools and one on networking, and when volunteer live bloggers were asked for I put my name forward. On the one occasion I've done this before, at the National Cancer Research Institute conference in Liverpool last year for the online oncology journal ecancer, it meant submitting two accounts of each day's presentations, one in mid-afternoon and the other generally approaching midnight. This time, I didn't even have the luxury of a few hours to put my thoughts together: I simply recorded them as the session proceeded, and the resulting words were posted on the CILIP East of England blog, much as they had been written.

In the rest of this brief post, therefore, I will concentrate on Phil's own presentation. This was divided into two sections: first on CILIP and the role of the President, and then on the importance of social media for information professionals (a topic on which he is not only an expert but also something of an evangelist). In the first, he discussed the challenges and changes facing the library and information professions, and the way that CILIP's role has been changing with an increased focus on advocacy.

Phil started the second part of his presentation by defining social media as much wider than just "Facebook and Twitter", rather a whole new way of looking at the Internet: a transition from expert-generated content to user-generated content. People, rather than organisations, are now in control of what is posted, and organisations - or rather, the people in them - need to realise this and engage with this new world. It is certainly possible to opt out, but opting out has consequences. People are building their own authority as professionals using their social networks. He cited an example of an employee in the US who moved companies, taking his thousands of Twitter followers with him, and he managed to prove in law that those followers belonged to him, the individual, rather than his company. Some of the newest search engines will favour sites recommended by reputable people that the searcher engages with.

His final message was that the social Web is "chaos" and "anarchy" but that this provides an opportunity, rather than a threat, for librarians, who are some of the best placed people to act as guides. And lecturers and writers, perhaps?

Phil Bradley's website is worth a close look. It can be found here.


Sunday 16 October 2011

Of pandas and other genomes

So, it's already three weeks after the ostensible end of 23 Things Cambridge, I'm still only two-thirds of the way through the Things, and this is my first blog post for a month. I do still intend to look at a few more of the Things; for one reason or another, Prezi, data visualisation, Creative Commons and QR codes hold particular appeal. But I have always had another reason for starting this blog: if it's not too self-referential, to flag up some of my own science writing and similar activities. And where else for Ursula Small to start than with a bear - although not a particularly little one?

Since 1996 (almost pre-history in Web terms) I have had a regular column in The Biochemist, the membership magazine of the Biochemical Society. This was my first piece of regular science writing, and I will always be grateful to its then editor, Frank Burnet (now emeritus professor of science communication at the University of the West of England) for giving my fledgling second career such a flying start. In my column, The Cyberbiochemist, I aim to explain a topic or issue related in some way to computational biochemistry (or bioinformatics) in a way that is relevant and interesting to bench biochemists. Each issue has a theme, and very often, but not always, my column is related to that overall theme. In October 2011, to celebrate both the centenary of the Society and its opening an office in Beijing, that theme was "Biochemistry in China".


My contribution to this "China issue" was a survey of how bioinformatics has developed in China over the last few decades. And, like so many other technologies, that development has been meteorically upwards. The first Chinese bioinformatics research centre was set up in Beijing in 1996, the year I began my Cyberbiochemist column, but only fifteen years later the country boasts some of the fastest and most sophisticated genome sequencing technology in the world.

China does, however, still see itself as a scientific leader of the so-called Third World. It is the home of genetic sequencing projects and databases for its staple crop, rice, and the iconic giant panda. The panda's Latin name, Ailuropoda melanoleuca, means literally "black-and-white cat-foot" and its classification within the mammals was unclear until the advent of molecular genetics. Now, however, it is known to fit in the family Ursidae: the true bears.

Chinese molecular genetics, however, has proved itself in more important areas than just in solving a controversy over the naming of bears. A consortium including many Chinese scientists was responsible for sequencing the genome of the virus that caused the outbreak of SARS (severe acute respiratory syndrome) in 2003, and just last year a Chinese-led consortium rapidly analysed the genome of the exact bacterial strain that caused a deadly outbreak of food poisoning (remember the killer cucumber stories?). If, God forbid (and as only one example), avian influenza were to mutate into a highly infectious strain, I predict that Chinese scientists would play a crucial role in analysing and eventually controlling it.

You can read the whole article here (freely accessible). I would love to post the gorgeous photo of a seven-month-old panda cub that the editors used there, but it's not Creative Commons. And that, as I've said already, is for another post and another time.




Monday 19 September 2011

Thing 15: Linked In

Oh dear. I'm going to have to stop counting Cam 23.0 in weeks - I'm already over a month behind, and at this rate it may be Christmas before I finish. I hope that no one's going to throw me out of the party on Thursday...

But I have finally come to Linked In - getting on for a month after its fellow "Week 8" Thing, Facebook. I have had an account on Linked In for well over four years - much longer than my Facebook account, and, in fact, so long that I can't remember when I joined. But I've come to blog about it, rather than skipping to a less familiar Thing (or even Extra Thing) because I have finally got round to updating my profile with such relative basics as a photo and a link to my Twitter account. Perhaps this post should have been named "Procrastination".



When the 2010 version of 23 Things reached social networking sites, Aidan mentioned my observation that, of the three communities I consider myself at least a part-time member of, biotech professionals choose Linked In but science communicators and educationalists congregate on Twitter. I still have little idea why. To my eye, Linked In seems more slick and professional than Twitter (let alone Facebook), less inconsequential, but also, perhaps, a teeny bit duller. It is strictly a site for professional business, with most of the updates appearing on a Linked In timeline being new connections and job changes. And I have used it for a small number of pieces of real professional business. At one end of the scale, a comment on a Linked In group led me to be invited to give a careers talk - not big money or even little money, only expenses and canapes, but something I find rather fun. At the other, I spotted a contract as the in-house science writer for an immunology company, and was almost appointed only to part amicably after realising that the required commitment of a day a week in Oxford was just too much for me.

If anyone were to ask my advice on how to get to grips with Linked In - and looking at my profile this is still rather unlikely - I would recommend a guy called Will Kintish. He is a small, unassuming ex-accountant, and one of the best networkers and trainers I have ever come across. His workshops can be fairly expensive (it is possible to find subsidised ones, and I was fortunate enough to attend one such earlier this year) but the combination of entertainment with useful, and memorable, networking tips may be well worth a substantial investment. And he is an evangelist for the value of social networking ("not just for kids") and for Linked In in particular. Check out his Linked In website - it's well worth a visit. I would suggest that you start with the four-minute video.


I imagine that librarianship has more in common with communication and education than with biotech, but I may be wrong. If this hunch is correct, and my hunch about the three communities is correct (and these combine into a pretty big IF) then I would guess Twitter to be many librarians' network of choice. Is this true for you?