Symposium on the Role of Scientific and Technical Data and Information in the Public Domain
5-6 September 2002
National Academy of Sciences Auditorium
2100 C Street NW
Washington, DC
Abstracts of Presentations
(as of September 4, 2002)
Session 1: The Role, Value, and Limits of Scientific and Technical Data and Information in the Public Domain
Discussion Framework
Jerome Reichman, Duke University Law School, and Paul Uhlir, National Research Council
Our purpose in this opening presentation is to map out in broad terms the traditional scope of scientific and technical (S&T) data and information in the public domain. Figure 1.1 provides a conceptual framework for this analysis. We divide the producers of scientific data into three sectors, namely, government agencies (primarily, but not exclusively, federal); academic and other not-for-profit research institutions; and private commercial enterprises.
The U.S. federal government itself produces the largest body of public-domain data and information used in scientific research and other endeavors, in terms of both the volume and the percentage of material produced. For example, the federal government alone spends more than $45 billion on its basic and applied research programs, with a significant percentage of that invested in the production of primary data sources; in higher-level processed data products, statistics, and models; and in S&T information, such as government reports, technical papers, research articles, memoranda, and other such analytical material. All of this information is exempt from statutory copyright protection.
Working scientists in academia also produce a vast amount of public-domain research data. Open data policies in this sector are directly promoted by the federal government through many of its research funding programs. Academic data sets are frequently placed in public domain repositories controlled by the government. In addition, the traditional norms of science that encourage data sharing and define fair and unfair practices in different scientific communities are largely enforced by peer pressures.
In the private sector, where the objective is to commercialize data, we point out that few and relatively weak intellectual property rights have nonetheless supported private investment in a vigorous U.S. database industry. At the same time, this intellectual property environment has facilitated the informal exchanges of individual scientific researchers, and it has not unduly impeded research at either the government or university level, except insofar as licensing contracts supported by new technological protection measures have begun to cut back to a significant degree on the pre-existing freedom of access and use.
Our final purpose in this discussion is to indicate how the role and value of public-domain data are potentially magnified many times over by the advent of digital technologies and new research tools, if the pre-existing legal regime were to remain supportive. In reality, however, the potentially enhanced role of public-domain data in science is threatened by a confluence of economic, legal, and technological pressures that are discussed in the second session of the symposium.
The Role, Value, and Limits of S&T Data and Information in the Public Domain in Society
James Boyle, Duke University Law School
Debates about the desirable limits of intellectual property are as old as the field itself. Similarly, debates about the role of public and private financing of scientific investigation have a long and distinguished history. In striking respects, though, contemporary events have shown the limits of both the “anti-monopoly” perspective on intellectual property, and of the assumption that “public” automatically equals “free,” or that “private” automatically means “controlled.” The hypertrophy of intellectual property protections over the last 20 years has exacerbated these problems, while the increasing reliance of universities on patent funding and licensing income promises to destabilize an already unstable political situation; in the policy process universities traditionally played the role of “public defender for the public domain.” That role can no longer be relied on, and the traditional alternative to propertized scientific research—government funded, Cold War style science—has both practical and theoretical problems. What is to be done? The answer is a complex one. Theoretically we need a much better understanding of the role of the public domain and of the commons—two terms that are not equivalent—in innovation policy. Practically, we need a series of concrete initiatives, aimed at both public and private actors, in order to stave off the ill-effects of “the second enclosure movement.”
The Role, Value, and Limits of S&T Data and Information in the Public Domain for Innovation and the Economy
Intellectual Property—When is it the Best Incentive Mechanism for S&T Data and Information?
Suzanne Scotchmer, University of California, Berkeley
Intellectual property is an important, but controversial, mechanism for encouraging investment in so-called information goods such as data or creative works, as well as in inventions related to manufactured items. However, IP is not the only mechanism, and possibly not the best one, since it leads to market distortions that might be avoided. Economists have recently revisited the question of when intellectual property can be justified as an incentive mechanism, recognizing that other mechanisms such as prizes and contract research are also possible. This line of thought is summarized in this presentation.
“Open Science” Economics and the Logic of the Public Domain in Research: A Primer
Paul David, Stanford University and All Souls College, Oxford
The progress of scientific and technological knowledge is a cumulative process, one that depends in the long-run on the rapid and widespread disclosure of new findings, so that they may be rapidly discarded if unreliable, or confirmed and brought into fruitful conjunction with other bodies of reliable knowledge. In this way open science promotes the rapid generation of further discoveries and inventions, as well as wider practical exploitation of additions to the stock of knowledge.
As a mode of generating reliable knowledge, “open science” depends upon a specific non-market reward system to solve a number of resource allocation problems that have their origins in the particular characteristics of information as an economic good. There are features of the collegiate reputational reward system—conventionally associated with open science practice in the academy and public research institutes—that create conflicts between the ostensible norms of “cooperation” and the incentives for non-cooperative, rivalrous behavior on the part of individuals and research units who race to establish “priority.” These sources of inefficiency notwithstanding, open science is properly regarded as uniquely well suited to the goal of maximizing the rate of growth of the stock of reliable knowledge. Non-cooperation in regard to the sharing of access to raw data-streams and information, and systematic under-provision of the documentation and annotation required to create reliably accurate and up-to-date public database resources, will significantly impair the system’s effective functioning.
What open science communities in their pure, “ideal-type” form cannot do is support their research activities. The disclosure norms oblige practitioners to give others quick access to what they have discovered—which vitiates the possibility of extracting an economic rent from (exclusive, or restricted) possession of that knowledge. Research which is organized differently, in order to permit the exploitation of intellectual property rights in scientific and technical data and information, constitutes a knowledge production mode that is well suited to maximize the flow of economic rents from the existing body of knowledge. But it too is not free of inefficient patterns of resource allocation: in particular, it is poorly suited to sustaining rapid accumulation of new knowledge.
The economic case for public patronage of academic research—and other, kindred forms of open science through a variety of devolved funding systems—rests mainly on the foregoing observations; and also upon the view that business firms typically are unwilling to undertake extensive investment of this sort, due to the greater uncertainties surrounding fundamental, exploratory research programmes (compared to commercially-oriented, applied R&D). It is particularly difficult to forecast when and how they would capture the economic benefits that might flow from the exploratory research inquiries. Considered at the macro-level, open science and commercially oriented R&D based upon proprietary information constitute complementary sub-systems. The public policy problem, consequently, is to keep the two sub-systems in proper balance; to prevent the long-run effectiveness of the open science sub-system, and with it the overall functionality of knowledge-driven activities in the economy and society at large, from being degraded by short-term rent capturing strategies for privatizing the existing contents of the public domain in scientific data and information.
Scientific Knowledge as a Global Public Good: Contributions to Innovation and the Economy
Dana Dalrymple, U.S. Agency for International Development
This presentation focuses on: public goods (= public domain); scientific knowledge (= information and data); and a global perspective (vs. domestic) in terms of historical development and economic and innovation dimensions. It then takes up the provision of public goods (public and private), conditions and constraints on use, and some possible downsides. The key elements are woven together in a simple graphic model.
Agricultural research is used to demonstrate these themes. It is viewed as a public good, first in the United States, and then internationally and globally. In the latter case, the Consultative Group on International Agricultural Research (CGIAR) is used as an example of the strengths and limitations of a multilateral approach of this type in scientific research.
Science and scientific knowledge have hitherto, curiously, been largely neglected in the general literature on public goods or the global public domain. They merit a more prominent place. It may also be useful for scientists to begin to view them in that broader light.
Opportunities for Commercial Exploitation of Networked S&T Public-Domain Information Resources
Rudolph Potenzone, LION Bioscience
Good science is always based on good research. But good research must be initiated on a foundation of data and information of all relevant prior work in order to produce its primary product: new data! It is in the formulation of the basic research program that information resources are so vital to every project, especially in the sciences.
The availability of good data, then, is the key to good research and ultimately good science. But where do these “good data” go once they are created, usually coming from a prior research project? If they are of particular commercial interest, such as drug information in support of a patent, they will most likely find their way into one of the huge chemistry repositories housed by the Chemical Abstracts Service (a subsidiary of the American Chemical Society), Derwent (a Thomson company), or Beilstein (from the Beilstein Institute and sold by MDL, an Elsevier company). It is obvious from these big names that data in highly commercial enterprise areas such as pharmaceuticals have little trouble being picked up from their sources (scientific journals, books, patents, etc.), abstracted, organized and maintained and then made available to the community. . . at a price. While there may be good profit in these particular data, it must be remembered that there is also a significant cost of creating these data sources.
Are data in less commercial areas less important to us as a scientific society? No, but it is the cost barriers that prevent ALL the good data from being collected, organized and offered. This leaves to public funding, or to the work of highly motivated individuals, the task of gathering these data and keeping them available.
Bioinformatics may be the antithesis of the chemical data in pharmaceuticals cited above. Here, publicly funded projects started early to create first dozens and now hundreds of protein and nucleic acid databases. While a few are managed commercially, the vast majority are still collected and made available for free with little support or income. Yet it is these very databases that are fueling the research advances in genomics, proteomics, etc. These non-commercial sources of data are vital to the scientific community and for our ability to continue to push the frontiers of science in the future.
The Role, Value, and Limits of S&T Data and Information in the Public Domain for Education and Research
The Role, Value, and Limits of S&T Data and Information in the Public Domain for Education
Bertram Bruce, University of Illinois, Urbana-Champaign
The rapid expansion of information in every discipline of study, coupled with new tools for accessing and using that information, has created multiple opportunities and challenges for educators. One response has been a resurgence of interest in inquiry-based learning, a philosophy of education articulated most effectively a century ago by John Dewey, as he saw the need for schooling to respond to a society undergoing social, economic, demographic, and technological changes reminiscent of the ones we see today. Inquiry-based learning seeks to take advantage of the opportunities offered by new sources of information, as well as to prepare students to cope with a world that is growing rapidly and changing. This talk examines Biology Workbench as a new education tool, which both affords opportunities for inquiry-based learning and demands that kind of learning from students, as well as their teachers. The workbench assembles more than 100 protein and nucleic acid databases, sequence analysis and visualization tools, research articles, and other information, and makes these available through the Web. We will consider what these new sources of information mean for education, and what the consequences of restricted access may be, both for current students and for the capacity to address future societal needs.
The Role, Value, and Limits of S&T Data and Information in the Public Domain for Earth and Environmental Sciences
Francis Bretherton, University of Wisconsin
Long-term, global data are critical for establishing facts and evolving scientific understanding of human interactions with the natural environment. Properly disseminated, such facts and understanding should provide the basis for public policy and its implementation, as well as economic benefits to a variety of specialized users. International sources are essential for much of this science and its application, as is sustaining a dependable, coherent, information system for acquiring, validating, and synthesizing the data into core products. Existing examples in varying stages of development include weather and climate observation and prediction, land cover and its changes, estimates of fish stocks, assessment of earthquake risk from petroleum survey data, and changes in carbon storage.
The analogy of a tree is presented to describe the necessary functions within such a system, including appropriate roles for the public and private sectors. Where the science surrounding core products is sufficiently mature, such products may have significant commercial application, provided the business model used is consistent with them continuing to be available in the public domain and the underlying data remaining accessible to scientific scrutiny and re-use.
Reference: National Research Council (2001). Resolving Conflicts Arising from the Privatization of Environmental Data, National Academy Press, Washington, DC.
The Role, Value, and Limits of S&T Data and Information in the Public Domain for Biomedical Research
Sherry Brandt-Rauf, Columbia University
Issues relating to the sharing of scientific data and to the movement of data between private and public domains raise critical issues for science policy and the progress of science. These issues are often explored using an atomistic approach that imposes artificial distinctions between the input and output of scientific work. By contrast, this presentation discusses and employs the concept of the data stream. The data stream perspective views data not as the specific identifiable outputs of scientific research but rather as entities that are part of a continuous and evolving flow of scientific production. This perspective incorporates a broader view of data, including information as well as a variety of scientific inputs and outputs including techniques, protocols, and materials that shape and are shaped by the process of scientific research. It also facilitates an analysis of the public and private aspects of scientific information at any given time, an equally fluid characteristic of the elements of the data stream. Understanding the ways in which portions of the data stream are disseminated—how, when, and to whom—is critical to addressing policy concerns relating to the role, value, and limits of public domain scientific and technical data and information.
Session 2: Pressures on the Public Domain
Discussion Framework
Jerome Reichman, Duke University Law School, and Paul Uhlir, National Research Council
In the opening presentation, we suggest the extent to which the vast reservoir of accumulated public-domain data feeds into the scientific research infrastructure and to the system of innovation to which it gives rise. That government-funded research plays a primary role in making this system of innovation the most productive in the world has often been pointed out. However, the indispensable role of public-domain data in nourishing this system, while generally taken for granted, is less clearly understood or appreciated, and new possibilities for further potentiating this role in the digital environment truly constitute another “endless frontier.” These matters require more attention lest growing pressures to fence the scientific commons lead to unforeseen and unintended consequences that seriously disrupt the national system of innovation as a whole.
In this session we review the many economic, legal, and technological pressures that have been placed already on public-domain scientific data. From an economic perspective, the trends to privatize the government’s public-good functions and to commercialize more of the academic sector’s research activities have been underway for over two decades. While this has certainly resulted in significant research advances and economic benefits, it is not without costs. Moreover, we see a further continuation of such privatization and commercialization of upstream public-domain resources as potentially having greater associated costs than benefits.
In addition, recent changes to copyright law and the widespread use of licensing contracts of adhesion in commerce—as well as exclusive licensing agreements on onerous terms for research tools and technologies in academia—are further diminishing the availability of public-domain data in science. These highly protectionistic legal mechanisms are increasingly enforced by more effective digital rights management technologies. Such developments are intensifying the tensions that already exist between the traditional sharing norms of science and the growing need to restrict access to and uses of data in pursuit of increased commercial opportunities. New national security restrictions on access to and dissemination of potentially sensitive research data and information are further constraining the availability of substantial amounts of material in the public domain. Finally, the recent enactment of a powerful new database protection statute in Europe and proposals for equivalent legislation in the United States might be expected to push these tensions into other areas of public research, heretofore less affected by the commercialization and privatization trend.
The Urge to Commercialize: Interactions between Public and Private Research and Development
Robert Cook-Deegan, Center for Genome Ethics, Law, and Policy, Duke University
The economic power of science and technology has become apparent since World War II, and policies have focused on encouraging economic growth through research and development (R&D). Three major stakeholders—government, private industry, and academia—have all adopted policies to foster commercial exploitation of new discoveries and new knowledge. Government has funded R&D and encouraged patenting of inventions discovered with federal funding. Universities have become much more systematic about seeking research funding and securing (and licensing) intellectual property. And private firms have proven adept at translating government and academic research into valuable goods and services. After five decades of movement toward pro-innovation policies, however, some unintended consequences are coming to light. These include confusion over when public investment is needed, conflict over who "gets credit" for public benefits, erosion of the public domain, increased costs of research attributable to intellectual property, and potential logjams in the innovation process itself due to strong intellectual property rights combining with atomization of its distribution among disparate actors. Genomics will be used as a case study in public-private mutualism, illustrating past trends and posing problems that innovation policy can expect to confront over the coming decade.
Pressures on the Public Domain from Intellectual Property Law
Justin Hughes, Cardozo Law School
Intellectual property laws produce their incentive effect—their raison d’être—by enclosing, by granting private rights, and, in some sense, by monopolizing expressions and ideas. In other words, when intellectual property laws work as they are supposed to, they have at least some adverse effect on what we call the public domain.
Almost every area of intellectual property law has evolved rapidly in the past decade, largely in response to the increasingly networked, digitized economy, but also in response to other technological developments [biotech] and, sometimes, simply in response to forceful lobbying by corporate interests. Three areas where current trends in intellectual property seem to threaten the public domain of most concern to the scientific and research community are: (a) extra-copyright protection of databases, (b) patenting of tools and techniques critical to research activities, and (c) limits on encryption research and access to traditionally public domain materials imposed by the Digital Millennium Copyright Act.
In each of these areas, the concern is not intellectual property law by itself, but how intellectual property law in combination with private actor strategies may unduly and adversely affect the public domain. Members of the scientific and research community should aim for a more thoughtful approach, both individually and collectively, as to how to “shield” public domain materials from privatization through a variety of instruments. For example, at the policy level, the lack of effective use of U.S. government “march-in” rights needs to be considered; can this type of prophylactic mechanism—which is also visible in the EU Copyright Directive—be made more robust and applied to copyright laws, database protection, and patented research tools? Can the potential propertization of public domain materials generated by government funding be easily curtailed by robust NSF/NIH grant rules? By grant rules or commitments modeled on the open source movement? Finally, we would all be better served by a clearer, sharper understanding of the “public domain.” In order to hold the line on undue expansion of intellectual property laws, the public domain must mean something more than all the materials that have not been propertized yet.
Legal Pressures in Licensing
Susan Poulter, University of Utah Law School
Until the advent of electronic publishing and the creation of large digital databases, scientific norms required that publications include the underlying data, and those data, once published, were available for all to use, subject only to copyright and patent protection. This norm is under pressure today from the licensing practices of some electronic journals and database providers, and even in some instances from scientific authors. There are several instances, including the publication of the privately-funded human genome project, in which publication occurred with less than full and open access to the underlying data, under conditions imposed by the authors. In a number of fields, database providers are assembling databases from published data and using licensing agreements that purport to restrict extraction or further use of data, effectively packaging public domain information in a form in which it can only be accessed and used through the payment of a toll. Moreover, privatization of public information has been encouraged by federal and state legislation, including the Uniform Computer Information Transactions Act, now enacted in two states. It remains to be seen whether these developments will ultimately increase rather than decrease the scientific and technical information available, or whether market forces will curtail some of the more egregious examples of restrictive licensing. If, however, such practices come into widespread use, they could significantly erode the public domain in scientific and technical information.
Legal Pressures in National Security Restrictions
David Heyman, Center for Strategic and International Studies
Between 1998 and 2000, the United States faced three major national security crises involving the potential loss of scientific and technical information. First, a high-level congressional investigation determined that China had stolen advanced missile technology from U.S. corporations as well as plans for the W-88—one of the nation’s most sophisticated nuclear weapons. Second, a scientist at one of the Department of Energy’s (DOE) premier national security laboratories was accused of giving sensitive nuclear information to China. And lastly, less than a year after the first two issues surfaced, two computer hard-drives containing classified nuclear weapons information disappeared from a DOE laboratory for over a month. These incidents spurred dramatic reforms—from both the legislative and executive branches—including the institution of numerous new security measures at DOE to protect scientific and technical information. An analysis of these changes reveals that while most were well intentioned, many reforms were misguided or misapplied and only exacerbated existing tensions, contributing to a decline in morale, and in some instances productivity.
Today, there are a number of growing efforts to protect and limit access to scientific information. These include efforts to restrict the activities of foreign nationals, limit information already in the public domain, expand the use of “sensitive unclassified information,” broaden enforcement of “deemed exports,” and impose new restrictions on fundamental research. We must understand today that heightened security is appropriate and necessary after September 11. But we should also be deliberate and learn from the Department of Energy experience, else, like DOE, we risk undermining the very security we seek and diminishing scientific programs vital to our economy and national security.
The Challenge of Digital Rights Management Technologies
Julie Cohen, Georgetown University Law School
After introducing the concept of digital rights management (DRM) and discussing the range of capabilities of DRM systems, the presentation will discuss their implications for access to and use of materials in the public domain, including scientific and technical data and information. DRM can prevent copying and distribution of public domain materials, including both uncopyrightable and copyright-expired materials. In addition, DRM systems can implement contractual restrictions, including pay-per-use restrictions that can be highly disadvantageous to scientific and technical research requiring repeated access to archives and databases. They also compromise the ability of libraries to deliver scientific and technical information to individuals, and that of individuals to use the information in their research. Although the restrictions imposed by DRM systems can be circumvented, these systems receive an additional layer of legal protection from the Digital Millennium Copyright Act (DMCA). In particular, as interpreted in several recent cases, the DMCA’s ban on circumvention devices is extremely broad. Although the DMCA includes a number of exceptions intended to safeguard scientific and technical inquiry, the exceptions impose criteria of necessity and secrecy that are fundamentally incompatible with the open exchange of scientific and technical information. The DMCA’s exceptions also are profoundly hostile to open source programming methodologies. The implementation of DRM protocols in trusted computing platform protocols threatens to hard-wire these adverse effects.
Session 3: Potential Effects of a Diminishing Public Domain
Discussion Framework
Jerome Reichman, Duke University Law School, and Paul Uhlir, National Research Council
The recent economic, legal, and technological pressures on public-domain data discussed in session two already are beginning to have noticeable negative effects in various areas of research. Based on additional current legislative proposals that would further diminish the scope of public-domain data in all sectors, we believe that this trend is likely to continue and will result in net social costs as the public-good functions of government and academic research are compromised.
In particular, new database protection regimes, such as the European Union’s Directive on the Legal Protection of Databases and equivalent legislation proposed in the U.S. Congress, represent a radical change in the economic nature and role of intellectual property rights (IPRs). Until now, the economic function of IPRs was to make markets possible where previously there existed a risk of market failure due to the public-good nature of intangible creations. In contrast, an exclusive property right in the contents of noncopyrightable databases breaks existing markets for downstream aggregates of information, which were formed around inputs of information largely available from the public domain.
We foresee negative effects from these developments in the broader information economy in the form of decreased competition and higher prices for downstream information products. More specifically, the cumulative negative synergies of the economic, legal, and technological trends, if not addressed, also are likely to have deleterious consequences for both public and private sector research including:
· Less effective domestic and international scientific cooperation, with serious impediments to the use, reuse, and transformation of factual data;
· Increased transaction costs driven by the need to enforce new legal restrictions on data obtained from different sources, the implementation of new guidelines concerning institutional acquisitions and uses of databases, and associated administrative and legal fees;
· Higher pricing of data and anti-competitive practices by entities that acquire monopoly market power, or by first entrants into niche markets; and
· Less data-intensive research and pervasive lost opportunity costs.
In this presentation, we review the potential effects of the recent trends in the context of governmental, academic, and commercial private sectors.
Potential Effects of a Diminishing Public Domain on Fundamental Research and Education
R. Stephen Berry, University of Chicago
Science thrives on open access to information. One way this access has been attained is by putting data into the public domain. Obviously basic science, particularly in academia and government institutions, functions by maximizing the availability of information, whether new results or archived studies. This openness is not only essential for the development and diffusion of new work; it also plays a key role in the validation process that sustains the reliability of scientific information. Furthermore, open access to data for basic science is as vital for much of applied science, including proprietary work done in industry, as it is for the sustenance of basic science itself. Applications developed in the private sector rarely appear without important input from the open literature (including data) and other openly available sources.
It is easy to predict, but difficult to quantify, how restricting open access will reduce the rate at which new science and technology appear. On the one hand, one effective way to discourage such restriction is to encourage and stimulate the growth of the scientific information base in the public domain. On the other hand, restricting the public domain and open access generally in this sector would be an extremely counterproductive measure for the possible benefit of a few, at the expense of the society as a whole. Moving in such a direction may turn out to be costly even to those who, by restricting access through privatization and commercialization, do so in the expectation of benefiting themselves.
The trend of the federal government to license key research data resources from the private sector, with many associated restrictions on access and use, is one major concern. Another is the growing trend to commercialize the fruits of academic research, including the underlying data. Both of these trends will be accelerated with the adoption of new and proposed IP laws that are likely to encourage such activities. Such outcomes are contrary to the purpose and benefits expected from the public generation of S&T data: public funds are used to support science because of the public goods that the science generates. Those goods, by definition of "public good," do not diminish in value by use. In fact, scientific data increase in value as they are used more and more. Restricting the distribution and use of such data thus can thwart the public interest in the funding of public research and associated data collection activities.
At present, the financial support for basic sciences, apart from the biomedical sciences, has diminished significantly, to the point that, for example, there is a severe conflict between the salaries actually paid to postdoctoral associates and what has been recommended by the National Research Council. The budgets of researchers and of university libraries will respond to increases in subscription fees by dropping subscriptions, not by finding more funds. Consequently it is very possible that restricting both the public domain and open access for scientific information will not only injure the scientific enterprise, but will be counterproductive even for those who wish now to impose those restrictions.
Potential Effects of a Diminishing Public Domain on Environmental Information
Peter Weiss, National Oceanic and Atmospheric Administration/National Weather Service
During the 21st century, unprecedented situations are likely to arise that could significantly challenge the way we live or cause dramatic changes in the economy. According to the National Climatic Data Center, the United States sustained 48 weather-related disasters over the past 21 years, each with overall damages and costs of $1 billion or more, and with total damages and costs exceeding $180 billion. Seven occurred during 1998 alone, the most for any year on record. Ninety percent of all Presidentially-declared disasters were weather and flood related, and 1998 was the most expensive year yet for natural disaster relief.
Innovative use of weather, water, and climate information will increase our safety and productivity, improve the nation’s competitiveness, and enhance our standard of living. For example, one study found that the long-range predictions issued by the Climate Prediction Center for the 1997-1998 El Niño led California to conduct major mitigation efforts, reducing losses by about $1 billion.
Communication technologies are fostering an unprecedented growth in the use of weather, water, and climate information, particularly by the broadcast industry and the emerging Internet and wireless industries. These technologies enable citizens and industries to get weather information when and where they want it. According to the Pew Research Center for the People and the Press, weather has become the number one form of news that people look for online. Recent surveys have rated NOAA’s weather-rich web site the most popular U.S. government site.
The United States government’s policy of open and unrestricted weather, water, and climate information is also the underpinning of an entire and growing industry of firms that gather, package and deliver this information and services to fit the specific needs of their clients. Estimates vary, but the yearly revenue of the commercial meteorology industry probably exceeds $500 million. Since users in a number of weather sensitive industries increasingly need expert support to interpret and apply meteorological data, this private sector market segment should experience further growth. Perhaps the best recent example of a business opportunity created by increased availability of weather and climate information is the weather derivatives industry, which, since the first public offering in mid-1998, has now mushroomed to a $7 billion industry.
These benefits of the availability of NOAA’s environmental data in the public domain will be contrasted with the policies and practices in the European Union for similar environmental data, where the government agencies commercialize such information at the source and treat it as proprietary, rather than as public-domain, material.
Potential Effects of a Diminishing Public Domain on Biomedical Research Data
Stephen Hilgartner, Cornell University
What effects might changes in the contours of the public domain have on biomedical research? This talk addresses this difficult question in light of recent work in science and technology studies that examines how scientists regulate access to data, biomaterials, and other resources. This work shows that scientists manage access through a complex mixture of formal communication and informal exchange, deploying a variety of techniques to control who gets access to what data and under what terms and conditions. Unpublished data, unique biomaterials, scarce skills, and other resources that convey a competitive edge are often used to broker collaborations, obtain funding, and build research networks. Scientists also actively constitute public domains through practices that create public resources, such as biomolecular databases, materials repositories, and published literatures. In short, a dynamic economy of data exchange operates out of view of those who limit their attention to the published literature. This paper engages in the admittedly speculative exercise of imagining how changes in the boundaries of public domains might be expected to change this “invisible economy.” Three kinds of effects are considered: direct effects on specific research projects or transactions; second-order effects on the culture of research communities; and third-order effects on the position of science in the wider polity.
Session 4: Responses by the Research and Education Communities in Preserving the Public Domain and Promoting Open Access
Discussion Framework
Jerome Reichman, Duke University Law School, and Paul Uhlir, National Research Council
If the economic, legal, and technological pressures on public-domain scientific data that were discussed in the previous two sessions continue unabated, they will likely lead to a disruption of long-established scientific research practices and to the loss of new opportunities that digital networks and related technologies make possible. These pressures could elicit one of two types of responses. One is essentially reactive, in which the public scientific community adjusts as best it can without organizing a response to the increasing encroachment of a commercial ethos upon its upstream data resources. The other would require a science policy response to the challenge by formulating a strategy that would enable the scientific community to take more active control of its basic data supply and to manage the resulting research commons in ways that would preserve its public-good functions without impeding socially beneficial commercial opportunities. The idea is to reinforce and recreate, by voluntary means, a public space in which the traditional sharing ethos of science can be preserved and insulated from the commoditizing trends. Our final presentation will review some approaches that the U.S. scientific community might consider in addressing this challenge, and that could have broader applicability for scientific communities outside the United States.
Promoting Access to Private-sector Data and to Publicly-funded Databases in the Life Sciences: New Approaches
Ari Patrinos and Dan Drell, U.S. Department of Energy
The long-established tradition of science has been that academic research resulting in new discoveries is published, for the free and unrestricted use by all, in exchange for public recognition of the researcher(s) whose insights, experimental skill, and effort have achieved this benefit for humankind. The funding agencies that support the bulk of academic and Government lab research have repeatedly affirmed this as explicit policy. However, recently the role of the private sector, through its increasing funding of research as well as its own proprietary research investments, has engendered controversy focused on the issue of access to its results when published in journals such as Science and Nature. The cases of Celera’s human genome sequence (J.C. Venter et al. (2001) “The Sequence of the Human Genome,” Science 291: 1304-1351) and Syngenta’s draft rice genome sequence (S.A. Goff et al. (2002) “A Draft Sequence of the Rice Genome (Oryza sativa L. ssp. japonica),” Science 296: 92-100) have underlined this controversy. This has resulted in a rancorous debate about the proper role of the public domain (e.g., Federal funding agencies) in promoting access to research funded by private firms, which all must acknowledge is the intellectual property of the company or companies that generate it. The data and other results derived from Federal funding are as accessible to these private-sector entities as to anyone in universities, the non-profit sector, or other government laboratories, and this asymmetry can be interpreted as unfair. The challenge for us is therefore to make constructive suggestions for ways to move forward that might reduce the impasse and perhaps promote greater data sharing to the benefit of all.
Various suggestions have been advanced, from timers (limits on the time private sector data can be withheld in exchange for publication in prestigious journals) to “honest brokers” who will oversee rigorous peer review but hold private sector data “in trust” for a predetermined period of time, to legal codification of a research exemption that would establish in law the ability of publicly funded scientists to use private sector data and discoveries without fear of accusations of infringement and consequent legal pursuit. While all of these ideas may have limitations, we assert the need to explore these and other conceivable options. Can incentives be defined that would induce private-sector companies to relax access restrictions on their data? If agreement can be reached, the benefits move in both directions, with academic expertise becoming more available to private sector companies and the science carried out in the private sector becoming more accessible to academic scientists, leading ideally to a “win-win” for both sectors.
Another related issue that affects the availability of much valuable data is the long-term support for, maintenance of, and curation of public electronic databases of biological information. While the best known examples are genome sequence databases, others also exist (Protein Data Bank, etc.), all of which involve large, expensive, and rapidly expanding resources that are fundamental supports for scientific research. From the perspective of supporting agencies, however, these databases also act as mortgages on the budget, often in direct competition with basic research. Traditional review procedures, designed to evaluate the merit of hypothesis-driven research applications, are not designed for evaluating the contribution of infrastructure projects and resources. If useful and well operated, these databases grow seemingly without limit. As biology becomes more interdisciplinary and discovery-driven from high-throughput data generation, each database needs to support interoperability to promote cross-database analyses in search of correlations from which new hypotheses are identified and can be tested in the lab. Many users of publicly supported databases are in the private sector, and the same issues recur about the apparent asymmetry of the public sector’s willingness to place data in these freely accessible resources.
Academics as a Natural Haven for Open Science and Public Domain Resources: How Far Can We Stray?
Tracy Lewis, University of Florida
This presentation reveals two critical roles of open science and public domain information in present day academic and research institutions. Clearly, open disclosure of scientific information and data is required for research collaboration and for dissemination and verification of results. But, less transparently, openness in sharing data and results is also required to address the significant information asymmetry existing between the specialized researcher and the research sponsor. Only the researcher’s peers can evaluate her contribution, and only if she fully and completely discloses her findings. Peer review provides the basis upon which universities monitor their researchers’ efforts and assess who the most talented researchers are. In turn, researchers’ professional reputations and recognition among their fellow scientists, and the basis of their employment and compensation, are determined by peer review of their findings. Open disclosure of research findings is therefore the glue that holds academic and research institutions together. The ability of openness to survive naturally in other non-academic environments depends on how similar the setting is to an academic one, the importance of basic research to that setting, and the ingenuity of managers in providing incentives for basic and applied research to occur simultaneously.
New Legal Approaches in the Private Sector
Jonathan Zittrain, Harvard University
The academic and non-profit communities have sometimes released their intellectual fruits wholly into the public domain, and at other times have claimed full proprietary interest. The use of intellectual property protection to prevent the derivative proprietization of one's public-domain work, once the idiosyncratic obsession of a handful of academic computer operating system designers, has slowly moved into the academic mainstream. Creative Commons, for example, is a new non-profit designed to streamline the introduction of material for free uses—"free" as in speech, not beer, as Richard Stallman is fond of saying. Such approaches are legally untested and still unrefined, but offer a window into what could be a more open future for production of knowledge in the public interest. Universities and libraries could play an important role in bringing about that future, if they are willing to abandon some of their own mercenary incentives.
Designing Private-Public Transactions that Foster Innovation
Stephen Maurer, Esq.
Current academic/industry transactions rely overwhelmingly on patent rights and exclusive licenses. For publicly funded science, these arrangements may not be necessary and could have deleterious effects on later innovation. But what should replace them? I will discuss recent innovative transactions and suggest specific criteria for deciding when they constitute an improvement over the present system.
New Paradigms in Academia
The Role of the Research University in Strengthening the Intellectual Commons: The OpenCourseWare and DSpace Initiatives at MIT
Ann Wolpert, Massachusetts Institute of Technology
Research universities play a significant role in the value chain of new knowledge creation in science and technology. Natural competition among academic researchers has been balanced, historically, by shared commitments to the education of students, the mutual advancement of academic research disciplines, and the expeditious sharing of research results through the peer-reviewed publication system. By practice and philosophy, academic researchers are committed to sharing new knowledge and understanding.
The growing importance of information technology as a tool in the development and management of new knowledge in scientific and technical disciplines is increasingly well understood. However, the impact of information technology on the educational and communications aspects of higher education is still emerging. The Internet offers both opportunity and risk to traditional positions at every point in the value chain.
The speculative application of standard business models to higher education has led to suggestions that the public domain is doomed as an essential element in the education of students, the advancement of knowledge, and the dissemination of research results. Assumptions emerging from this business model include the tightening of controls by traditional publishers through expanded legal regimes and contract law. Likewise in this model, higher education is expected to generate additional revenue streams through the close management of educational materials and increased control over institutional intellectual property.
An alternative scenario to this speculative business model is emerging at academic research institutions such as MIT. These institutions seek to retain, to the degree possible, the spirit of public domain access in their approach to the management and distribution of intellectual works. MIT’s OpenCourseWare initiative is designed to make the pedagogical structure of all of MIT’s courses freely and routinely available on the world-wide Internet. The DSpace project will provide MIT faculty with a secure, dependable repository and distribution service for their digital works, both research and educational. DSpace is, additionally, intended for open source availability, and interested partners are being actively sought to federate in its future development and support. Both DSpace and OpenCourseWare expect to encourage faculty participation in the Creative Commons public domain license initiative.
The innovative and experimental nature of OpenCourseWare and DSpace has served to illuminate the barriers to implementation of a public-domain-like approach to the open sharing of academic educational and research materials. Possible solutions are beginning to emerge.
Emerging Models for Maintaining Scientific Data in the Public Domain
Harlan Onsrud, University of Maine
Scientists need full and open disclosure and the ability to critique in detail the methods, data, and results of their peers. Yet scientific publications and data sets are burdened increasingly by access restrictions imposed by legislative acts, case law, and publisher practices that are detrimental to the advancement of science. As a result, scientists and legal scholars are exploring combined technological and legal workarounds that will allow scientists to continue to adhere to the mores of science without being declared as lawbreakers. This presentation reviews two separate models that might be used for preserving and expanding the public domain in scientific data. Explored are the technological and legal underpinnings of Research Index and the Public Commons for Geographic Data Project. The first project relies heavily on protections granted to Web crawlers under the U.S. Digital Millennium Copyright Act, while the second is based on legal approaches utilizing open access licenses.
New Paradigms in Industry
Open Source Software in Commerce
Bruce Perens, Hewlett Packard
This presentation will report on the astounding acceptance of Open Source software in business. Banks are some of the most conservative businesses around. Why, then, have they embraced Linux? The investment bank Credit Suisse First Boston is using Linux systems to drive billions of dollars of program stock trading per day. Multinational retail banking giant HSBC is deploying Linux on the desktop. Medical imaging giant GE is employing it in CT scanners. How can Free Software be trusted for all of these mission-critical applications, and why do its users choose it over the proprietary alternatives? What are the complications of employing Free Software in a business? Where do computer vendors, media vendors, and Free Software come into conflict? What legislation may be necessary to support healthy competition between Free Software and proprietary software in the future?
Corporate Donations in Geophysical Data
Shirley Dutton, University of Texas at Austin
Shell Oil Company developed a new paradigm for the transfer of private geologic data to the public domain in a business model that has since been modified and followed by Altura and BP. These companies have donated more than a million boxes of valuable geologic core samples to the Bureau of Economic Geology, a research unit of The University of Texas at Austin. The companies have also provided data about the cores—state, county, operator, lease, and depth, and, in many cases, unique well number (API number). These data will be available over the Internet in a searchable database. A critical aspect of the donations is that the companies also donated the buildings used to store the cores and provided cash endowments that allow the Bureau to operate the facilities and provide public access to the data. A feature perhaps unique to geologic data is the high cost of preserving it; rocks are heavy and take up a lot of space. Curating the cores in an organized way and physically retrieving them for researchers to view and sample take space and manpower. The university would not have been able to accept these core donations without the accompanying endowments.
In the United States, the company that acquires geologic data such as core samples owns them. Many large U.S. petroleum companies have closed their research labs and are reducing or eliminating the cost of maintaining privately owned core-sample collections that they once housed for proprietary use. The business models used by Shell, Altura, and BP have worked well in transferring private data to the public domain, models that could be applied at a national level.
Data Release Issues Illustrated by Some Major Public-private Research Consortia
Michael Morgan, Wellcome Trust
The international collaboration to sequence the human genome established from the outset data release principles commonly referred to as the “Bermuda Rules.” In establishing the single nucleotide polymorphisms (SNPs) Consortium, these principles guided the intellectual property and data release policy embodied in the legal agreement establishing the consortium. This policy has been seen to be robust by both the industry partners and the Wellcome Trust, and it has provided guiding principles in establishing new consortia. These case studies will be discussed in more detail.
|