“Its gunna cost ya” – who pays?

July 14, 2009

Hi I’m Seb Chan.

I’m really excited to be on the Taskforce. And you can read about what I do over at my work blog at the Powerhouse Museum in Sydney.

I’m simplifying things a bit for brevity here but one of the big issues around access to government data is that of cost.

Creating, collecting, and preserving data costs. Whilst your taxes are paying for this, many government agencies have also been asked, over the years, to generate revenue from selling data and/or access to it.

This revenue sometimes generates a profit, but other times it doesn’t even remotely get near covering the cost of selling it. The business units that sell this data rely upon clarity in terms of intellectual property in order to develop their business models and thus they’ve been encouraged to operate to protect their own interests and data (even when they might now run counter to those of the community).

Add to this the complexity of government IT systems: where legacy systems are involved, getting the data out can incur substantial costs. At the very least there are time and resource costs. And because of prior policy decisions around outsourcing, there are now often third-party fees payable to outsourced IT service providers.

So this is a tricky area to venture into and there need to be some clear ways of addressing questions around these matters.

With my team I’ve done a fair bit at the Museum around the Creative Commons licensing of all our text-based research content on our website as well as the release of a whole lot of historical images to Flickr with ‘no known Copyright restrictions’. Now we do run a business unit selling images, and we sell our text-based research through the exhibitions and publications we produce.

So how can we justify “giving it away”? Well, our mission is closely aligned with education and to better serve citizens, students and the community it makes sense to. Perhaps surprisingly, it also makes good business sense in the digital age.

One of my team at the Powerhouse, Paula Bray, recently published a paper looking at the cost implications of making these historical images available for free via Flickr. As it turns out, we are selling more now than ever before – even though ‘customers’ can get them from Flickr for free! (In fact our referrals from Flickr which result in sales are up too!)

Similarly the ABS is finding out that since they’ve made their data freely available the online usage of their data has shot up. This results in new business opportunities just as others evaporate.

The Inquiry Into Improving Access To Victorian Public Sector Information & Data covers these issues in some detail in Chapters 7 (Pricing) and 8 (Technical Infrastructure).

However all this extra usage also ends up generating additional costs. Think of all the extra enquiries, the extra requests that this extra usage and access generates. At the Powerhouse, since making our collection research more accessible, more usable, and more ‘open’, we have seen the volume of public enquiries more than triple, and we struggle to answer all the enquiries we now get from all over the world. For us, this is an exciting opportunity – but it is also a huge challenge.

So how are we, as a society, going to pay for all this data – not just the collection and storage, but access and increased usage?

18 Comments
  1. Chris
    July 14, 2009 10:16 am

    Seb, you provide a good insight into the challenges faced by government departments to open up access to data.

    I would like to take aim at one point however. You say “However all this extra usage also ends up generating additional costs. Think of all the extra enquiries, the extra requests that this extra usage and access generates”.

    Google, Microsoft, Yahoo (owners of Flickr) and many other players in the search space are competing like mad to make available as much information as possible and do this completely free of charge. Your example of using Flickr is a good case in point. Flickr make available lots of tools and APIs to make it easy to upload images because they want to become the definitive image resource on the internet.

    If the government established a cohesive approach to making as much data available they could then approach these search companies who would LOVE to make this information available free of charge — costing the government nothing in servers, site admins, site software etc.

    For example
    – Google is paying to scan every book in libraries across America
    – Wolfram Alpha wants to make as much data available about pretty much everything. They will even pay for it!

    Also, as a taxpayer I take offence at having to pay for data collected by the government, but that’s another story.

    • Paul Rowe
      July 17, 2009 2:54 pm

      The rush by Google, Microsoft, Yahoo and Flickr to make as much information as possible freely available is not without its own agenda. All of these examples in turn display ads against the ‘free’ content to generate significant income. Google information interfaces, including Google Books, include ads which cover the costs of creating and distributing that content. For Google this was $US 21 billion in 2008:
      http://investor.google.com/fin_data.html

      Can a government or public institution recoup costs through ads? I think there would be less tolerance for this amongst users than on corporate information search sites.

      • July 17, 2009 3:59 pm

        Paul, this is one of the reasons why the “distribution” of information from government sites should be via third parties as well as directly from the government sites. My guess is that most people would object to the government advertising anything other than its own programs. That is, like the ABC it can advertise its own complementary services but it should not be advertising commercial services.

        However, this does not apply to “resellers” and repackaging of government information. A good model is a royalty model where the government gets a percentage of the income generated by the resellers. It is difficult to set a price for most information but you can set a price for reliability, authenticity, access, ease of use etc. Also it is not necessary to set a price for all users as it simply results in “churn”. For example, it is not worth collecting money from another government agency for its use of the data nor is it necessary to charge charitable or government supported agencies.

        There are other situations where there is a quid pro quo that need not involve money. For example a government organisation may hold some information and allow users to correct and add to the information in return for using the information. In other words the wiki concept can be used not only for access but for collection and gathering of information.

        Money collected is not the only way to evaluate the success or otherwise of information dispersal. Other methods are accesses, reuses of information, money collected by reusers, reduction in costs and time from use of data. All of these can be collected or estimated if they are considered when designing the access systems. They can then “automatically” supply evaluations in real time.

        Probably most government information can be collected automatically and distributed automatically if we design the system appropriately. For example, I am about to pay my rates and taxes. As these get paid I see no reason why I could not be told how many others have paid and the percentage still to go. I would also be interested in seeing the total of bad debts per suburb. When I go to pay my water and sewerage bill I would like to see the average consumption of water per person for each suburb and the percentages paying and not paying, not only in my city but across the nation. The reason I would like to see this is to be assured that others are doing their bit. Making such information available might seem petty and trivial but the fact that it is there and we can verify it means that we build social capital. It also means that discrepancies in delivery of services across the nation become publicly visible.

        In other words instead of having to wait for someone to do a survey or conduct a study we can with modern technology monitor society in “real time” and not only see problem areas but see how we go at fixing those problems as we introduce programs to fix difficulties.

        One of the keys to building efficient organisations is to make good practices routine and to highlight discrepancies in outputs when they become apparent.

        For example, with electronic cash registers and inventory systems available in all stores, there is no reason why my purchases could not be automatically sent to me, and why I could not then give those records to a government or commercial agency that can publish aggregate figures in real time of actual purchases made of different items across the country.

        There is no reason why I couldn’t send the results of my latest blood tests automatically and anonymously, but with personal characteristics, to the Bureau of Stats, and why they couldn’t allow me to see how I compare “in real time” to my cohorts.

        What I am trying to say here is that we should not just think about what we are doing at the moment but we should take the opportunity of imagining what can happen when there is a free flow of information and think of ways of making the collecting of information part of other activities that we do routinely.

        Think how this could work for income tax. Instead of filling out a tax return once a year, why can’t I send all the relevant bits of information to the tax office (or some private agency) every time a transaction involving tax occurs, so that the tax is automatically calculated and paid? This would save my employer having to take the tax out of my salary and then me having to reconcile everything at the end of the year. It would also stop things like the overpayment of dependent benefits etc. It is my guess that allowing taxpayers to opt in to such a system would be very popular. As a side benefit each taxpayer (and the government) could see in real time how the rest of the country was going in terms of paying their taxes and get very good statistics on employment in “real time”.

        The message is that making information flow freely in both directions will lead to interesting outcomes that will almost always work towards more efficient operation of society.

  2. ben rogers
    July 14, 2009 11:11 am

    Seb,

    great post – the making available of public data also has a major impact on the data collection processes of govt. Lots of data is sold to Govt with very restrictive licensing arrangements – meaning that even if it wanted to, the Govt would not be able to provide citizens with access to the information. To avoid this Govt would have to purchase datasets with more liberal licensing arrangements, which can incur a much greater cost. Perhaps the flow-on effects would more than compensate Govt and vendors in the long run, but the initial hurdles would be high I would imagine. Some mechanisms around this are whole of govt purchasing arrangements – but this would require state and fed govts coming together to negotiate with vendors for bulk rates on data – something people are working on but that is not there yet.

    Ben

  3. ben rogers
    July 14, 2009 11:12 am

    forgot to set email notifications :)

  4. July 14, 2009 5:25 pm

    Excellent point and one that many advocates of open government data should bear in mind. I was wondering if you have any data concerning costs and benefits for organizations that are not supposed to sell material, i.e. using Flickr as an alternative to entire portions of their own web site.

  5. July 14, 2009 6:51 pm

    It should be a simple matter to get data out to the public … but I can’t apply that statement to any agency without understanding all aspects of the organisation’s operations.

    A good point is made: agencies are encouraged to generate revenue to contribute toward costs, and Seb makes a further point that sometimes the revenue doesn’t come close to covering the costs.

    It’s at this point that one needs to consider the corporate objectives and delivery strategy of the organisation. In my Public Service Board and Department of Finance days (I left about 20 years ago) we used to use a few techniques like these:

    > define the purpose and objectives of the organisation
    > relate the organisation’s program structure to the delivery strategy
    > define the performance criteria of the programs
    > set up some charts with a performance criterion on the x axis (with most value on the right end of the scale) and cost of delivery on the y axis (with most cost on the top of the vertical scale)
    > locate each program’s x,y on the chart.

    This then allows one to identify programs that have most contribution to the organisation’s objectives (being clustered at the right end of the x axis) and their relative costs (with the preferred location being the bottom right hand corner of the chart).
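
    The charting steps above can be sketched in a few lines of code. This is only an illustration: the program names and figures are hypothetical, and the scoring rule (rescaled performance minus rescaled cost) is an assumption standing in for "closeness to the bottom right hand corner", not the technique as it was actually applied.

```python
from dataclasses import dataclass


@dataclass
class Program:
    name: str
    performance: float  # x axis: contribution to objectives, higher is better
    cost: float         # y axis: cost of delivery, lower is better


def rank_programs(programs):
    """Order programs from most to least attractive for funding.

    Both axes are rescaled to 0..1 so they are comparable, then each program
    is scored by how close it sits to the bottom-right corner of the chart
    (high performance, low cost). Assumes at least one program has a
    positive performance value and at least one has a positive cost.
    """
    max_perf = max(p.performance for p in programs)
    max_cost = max(p.cost for p in programs)

    def score(p):
        # Assumed scoring rule: reward performance, penalise cost equally.
        return p.performance / max_perf - p.cost / max_cost

    return sorted(programs, key=score, reverse=True)


# Hypothetical programs and figures, for illustration only.
programs = [
    Program("Online collection access", performance=9.0, cost=2.0),
    Program("Image sales unit", performance=4.0, cost=5.0),
    Program("Printed catalogue", performance=3.0, cost=8.0),
]

for p in rank_programs(programs):
    print(p.name)
```

    Weighting the two axes equally is a simplification; an agency would choose weights that reflect its own objectives.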

    So how does this help Seb’s problem? The solution we used to propose was to fund highly relevant, high-performing, efficient programs at the expense of programs that were less relevant or lower performing.

    If one looks at the internet as a service delivery mechanism, then criteria such as % of the population reached, quality (richness) of information/experience delivered and low cost of an efficient internet delivery should locate an efficient internet-based delivery of information as a prized program.

    Even so, there is still a service delivery cost – should it be free to the public? I think that should be answered at the whole of agency level: what are you expected to achieve, and how much is the government prepared to provide in funding to achieve that outcome? Thinking really hard about that, one might conclude that a significant proportion of the agency’s budget should be used to fund free delivery of information.

    So we come to the “user-pays” argument. I don’t have much time for this argument in circumstances where it is not universally applied. There are many services I pay for in my tax but I don’t use. How is it that some services attract “user pays” and others don’t? Why is any information free?

    Consider the ABR: a really good free service with a good service delivery infrastructure (web services). There’s an objective – get people to use the ABN. To do that, make ABN data free and available. And that works!

    Now consider information on public registers: ASIC, Bankruptcy etc. You have to pay a service company to use this information.

    Consider PSMA, Census and ABS data … collected for the public good, but locked away behind a “user pays” mechanism.

    OK … I confess, I’m out of date on a lot of this, but I think the principles still apply. If the taxpayer funds a program to provide information (or education) for the public good, then get on and “put it out there” as efficiently as possible, with a focus on reach and equality of access, and usefulness (provision of tools for use).

    The ABR is a really good example of getting this right.

  6. July 14, 2009 9:41 pm

    The Victorian Inquiry Into Improving Access To Victorian Public Sector Information leads the way on ideas, policy and strategy. The Victorian government has six months to respond to the recommendations. Although I’ve seen no announcements on who will lead the Victorian government response, Randall Straw (Deputy Secretary, Innovation and Technology) for the Department of Innovation, Industry and Regional Development (DIIRD) is a likely candidate.

    For the vast majority the significance of this report is not yet broadly understood, but all of the recommendations are well grounded in ideas that have been extensively written about, researched, tested, applied overseas, dissected, brewed and converging for some years.

    I welcome the majority of these recommendations with enthusiasm. I have ten years in the emergency services, and am no stranger to the struggle of obtaining data and information, even from within other government departments or agencies. You’d think data custodianship was leprosy based on the way government departments and authorities seem to have shied away from taking responsibility for their own data. The discoverability, access and exchange of data is ever hampered by resource-wasting and ultimately demoralizing furphies, failures and fudging around licensing, costs, privacy, lack of standards, lack of interoperability of systems, proprietary software, and lack of metadata.

    We’d all rather be spending our time value adding, making a difference, doing what we do best. We have some really big problems to solve in this country: bushfires, water, salinity, housing shortages, health care, biosecurity, education, to name a few.

    The recent tragedy of the bush-fires should give us all pause, and also the courage to say that we will no longer accept lame excuses, naysayers and heel-draggers. As the report aptly lays out, all of the issues around privacy, licensing, funding and standards are indeed surmountable, if only government has the will to do so. If government does not lead the charge, then it should at least clear the highways and step out of the way. The time has come to knock down all barriers to the access of any and all information necessary to unleash the potential that we know is there.

  7. July 15, 2009 7:49 am

    A good post and one that shows that “free” is a good marketing tool.

    As others have pointed out, the cost of access is small in comparison to the cost of acquisition. If the Powerhouse is collecting the information then making it available at the cost of access is what should happen and should be the rule for all government information.

    If we attempt to get people to pay for the cost of acquisition – using the so-called user pays principle – then we will get little access. I think most people will pay for the cost of access but not the cost of acquisition or of storage.

    However, once electronic access is put in, the cost is essentially zero so it is not worth charging for it.

    So let us stop even thinking about charging for access and not get into the trap of trying to cover the cost of acquisition and storage through access charges.

    If the information is of benefit to the community then they will access it. If that access gives a monetary return to some in the community then you could well find that those benefiting can and will contribute to the cost of maintenance. A model for this can be found in the Intellectual Property regimes of some US universities such as Stanford where they do not charge for access to IP but if you make money out of the access then you are expected to pay something back to the University in terms of royalties to further the development of IP in the University.

    Models can be established so that money can be given to government agencies from people who use the information and make a profit from that use. After the profit is made then they can contribute back to increasing the information.

    This strategy is an “investment” financing strategy as opposed to a “rent seeking” strategy and could be applied widely throughout government agencies. The critical part is that any payments back to agencies go to the agencies for the specific purpose of increasing the quality and quantity of data and do not go into “consolidated” revenue to be spent on anything.

  8. Geoff Barker
    July 15, 2009 11:22 am

    ‘whose cuisine reigns supreme’

    I agree with Kevin and Seb that free acquisition and access models offer great opportunities for reassessing the role of the distributed collections and would like to make two observations.

    1 – From a museum curator’s perspective I think it’s important to build models that carefully appraise where and at what stage of cataloguing information is distributed. The delivery of content-objects (like photographs) into spaces like Flickr, Facebook, its own or others’ websites, and government distribution hubs like Picture Australia, pushes content into spheres that serve different purposes and generate different kinds of new information (mouseover comments, emails, letters, phone calls, twitter feeds) about these objects.

    In a museum context content-objects are linked to its responsibilities as a public institution to manage and present the best possible information relating to the objects. Placement of content-objects in different locations, as we have seen with Flickr, increases our potential avenues for adding to that knowledge. But as pointed out by Seb the question here is how do we model in the cost of re-acquiring this knowledge into the museum’s database to ensure its preservation? I also wonder where these new directions place past endeavours like Picture Australia?

    Perhaps the museum needs to take a different approach altogether to the knowledge systems it creates around objects?

    Certainly careful project management to prepare groups of objects for delivery to a set minimum standard (agreed on and clearly outlined in policy documents) would appear to be a good start in curbing any unsustainable public expectations, and would provide a more cost-efficient mechanism than attempts to reharvest from a multitude of sources over an unspecified period of time after delivery.

    2 – When the same or similar content-objects (I am thinking here specifically of photographs) are delivered from across a number of government sites, repeated objects and inconsistencies between the data appear. Could a start be made in addressing this by setting (state/federal) priority listings for organisations to share with each other before they begin new digitisation or delivery projects? This could create a great opportunity for the upgrading of information across institutions and ensure consistent labelling of content (see http://tiny.cc/asRPu). It would also, if there is no cost and no competition involved, help the user to select the highest quality version of the content available.

    I just have to add that I wonder: if success measures are tied to hits and access to web 2.0 distributed collections, will institutions be happy to not put material online if another organisation has already put it online? Is data quality an issue here?

  9. July 15, 2009 9:55 pm

    Some great feedback here. Thanks.

    Responding to a few questions:

    @chris: my point wasn’t so much about the storage and delivery, third parties can share that load, but about what happens when the demand for government services increases as a result, or, more problematically, the type of demand changes. An equivalent argument is heard in the discussions around private health care, private schooling etc – “if everyone used the public system it would collapse”. Don’t under-estimate the effect of making government data more accessible and more usable. Personally I think this is a “good problem” to have.

    @ben rogers: there’s a whole lot of responses around this in Brian Fitzgerald’s recent post which are fascinating reading.

    @andre di maio: there’s a lot of literature, especially in the non-profit tech (#nptech) community, around the use of Flickr as a means to undertake reasonably complex projects at very low cost. One early adopter university museum in the USA used Flickr as the backend for their online collection.

  10. Mark Hatcher
    July 16, 2009 8:41 am

    Seb,

    Thanks for your post.

    While it may be tricky to calculate the (tangible) incremental costs of producing the additional service, it is perhaps more difficult to calculate the benefits to the community of providing your information.

    I think that this is a real challenge for advocates of more accessible government information.

    I am not sure whether using narrow financial terms like “profit” in regards to the provision of government services is consistent with their reason for existing.

    Regards,

    Mark

  11. Fiona Cameron
    July 16, 2009 11:00 am

    Hi Seb,
    Not sure who to direct this query to, so you are getting it because you are a member of the taskforce and have the latest blog entry up! I know this is a very old school sort of question, but here at the Centre for Policy Development we are working on a submission to the Taskforce, and would like to know the timeframes. Has the Taskforce got a deadline for more formal contributions, as opposed to blog comments?
    cheers
    Fiona
    (PS I don’t think I’m a total dinosaur, but I couldn’t find any “contact us” alternatives on this site)

  12. July 16, 2009 4:33 pm

    Thx Fiona,

    I’ve sent this on to the Secretariat and I expect we can get you a reasonable answer within a few days. We’re finalising the issues paper and have left the deadline for written submissions till the end of that process. My thinking is that we should give you guys four weeks, and if that’s too little time, you already have a very abbreviated issues paper – it’s called the Terms of Reference.

  13. July 16, 2009 5:23 pm

    In Victoria the emergency services purchase government spatial data from the Department of Sustainability and Environment under a licence (those within the DSE do not pay). This is the system that we’ve been stuck with and hated for many years, under a government ‘cost-recovery’ policy.

    It’s a situation that members of the emergency services have fought for years, until we grew tired of fighting and learned to live with it. We take solace in the knowledge that these funds do go into maintenance of the core state data.

    However, it is my opinion that time spent negotiating licence and financial arrangements between government organisations does nothing to enhance public safety outcomes. It can also create conflict and undermine productive inter-departmental relationships.

    If data is important to a government department’s core business, then it should be funded and maintained to a high standard by government. That is the bottom line.

    Having already been paid for once by the taxpayer, the data belongs to the taxpayer and should be made available at minimal cost of delivery.

    Sadly, cost is not the only obstacle. Too much data remains locked up within government departments who for a variety of reasons refuse to release it.

    I would hate to think that public safety ever had to be compromised because an emergency service in Australia could not access a data set existing within another part of government.

  14. July 17, 2009 5:21 pm

    There are many very relevant and valid comments on this topic. However, I also feel that there are some simplistic views being expressed. I believe that the majority of people feel that government information/data should be made available to the public at no or very low cost. The issue this raises is that for many years agencies have been managing and maintaining their data sets using revenue generated from sales – eg DSE in Victoria. A whole new funding model is required to ensure that if the revenue stream is removed, then funding is available to maintain the data set.

    Yvonne makes a valid point about emergency organisations having to pay for data that may be life critical. The other side of the coin is that if the data is given away, and there is no change to the funding model, then the quality of the data will suffer and this will also be potentially very dangerous.

    Other comments relate to ‘just make the data available’, and while the objective is supported there are other issues to consider if we wish to ensure that the information is going to provide real value to the community. One of the first issues is discovery of the information – it may be available but not easily located, especially by people not part of the community of interest that the data is relevant to – secondary level users may not know what agency holds the data. Once found, the data needs to be accessible and in the right form to be of use. In the past many geographic information systems had proprietary formats and while there is an improvement, it is not always possible to read or use data from a web site due to format and other issues.

    To establish mechanisms to improve access to PSI a number of processes must be put in place. These include:
    identification of the data custodian and the authoritative source
    metadata to describe the data or information of interest to determine fitness for use
    new funding models to ensure continuance of data collection and data management of significant data sets
    appropriate policies and guidelines in relation to genuine national security and privacy type issues
    determination of relevant standards for data discovery and access and also to enhance the usability of the data

    As I have mentioned in previous discussions, the government spatial community has addressed (mostly successfully) many of these issues and has a governance and administrative framework, a policy, tools and other capabilities to make data easily available. We should build on this existing experience and capability.

    Finally, most agencies are not funded to make all their data available via their web site. They cannot for security reasons enable the web site to connect to their backend databases.

    All these issues can be resolved, but this will require a fundamental shift in relation to funding and a strong policy framework to ensure that the philosophical shift is adhered to.

    • July 17, 2009 7:28 pm

      Ben,

      Although you might think some suggestions on funding are simplistic, the truth is that the best business models are often the result of “accidents” and the best ones are simple in idea. To give an example, look at Google. The inventors of Google did not think of their revenue model until the system had been operating for some period of time, and then it was an idea from a new employee. This is now generating revenue of $22 billion per year, all from giving away something for free.

      Governments are not in the business of trying to generate revenue from the provision of information and government systems are not tuned to the realities of trying to sell things. Their main focus is on compliance and on making sure that the systems fit with the legislative requirements.

      Opening up of data through commercial channels as well as directly to the public is so important. If information has to be kept for government reasons, then any income derived from the use of the data is a secondary benefit and should only be used to increase the quality and accessibility of the data. That is, governments should not get into the business of collecting data to sell. Governments should be in the business of collecting data for government purposes and if by chance it sells then the money should ONLY be used to enhance the collection and distribution of data and not seen as a source of income.

      To give you an example: the collection of ASIC data and the distribution of ASIC information is very inefficient. It is a case where the collection and distribution of data is used as the mechanism to fund ASIC activities. This has led to all the problems you always get when you allow a monopoly supplier to control the market, as anyone who tries to use the system will soon discover. In contrast the ABN numbering system has been an incredible success, it has cost a fraction of ASIC and it is free.

      As soon as governments start to think of data collection and distribution as a substitute for taxation and a money earning activity the objectives of Government 2.0 will be lost.

      What I have been suggesting is that government concern itself with what government has to do but permit others to commercialise data. If the commercialisation of data creates an income stream then the commercial users will pay money to the agency to enhance and maintain the services but that is a secondary and minor part of the activity. Commercialisation should not be permitted to hijack the benefits that government 2.0 will produce in ways that will often be unexpected.

      So to summarise.

      Electronic access to data is not expensive and hence it is not worth charging for it. Wherever possible electronic access should be made available to all government data.

      Collection and storage of data can be expensive but if it is not required for government purposes it should not be collected. Governments are not in the business of collecting and selling data.

      If the use of data results in a profit to someone then part of the profit should be returned to the government but only to increase the quality and accessibility of the data from which the profit was derived.

  15. July 17, 2009 6:38 pm

    I agree whole-heartedly that data quality is paramount and must not be compromised.

    If there were ever to be a shift away from cost recovery/revenue generated from sales – eg the spatial data products maintained by DSE in Victoria – then I would fight against it tooth and nail UNLESS there was a funding model that provided a revenue stream to maintain the data, with secured and sustainable funding.

    That means that funding must NOT be subject to Departmental budget cuts or reorganisations.

    On the other hand, if revenue generation drives the data, then there is a real risk that commercial priorities can supersede those of emergency services, or of any other government user for that matter. If a data model change or enhancement is needed to accommodate emergency services requirements, but that change is not of widespread interest to other commercial users, then conflicts of interest and motivation may arise. It all becomes rather unstable and uncertain: is the data a commercially driven product or is it a public good? How can it be both? Thus, even a revenue generating model must have safeguards to ensure that the data realizes its maximum value for public interest and not for revenue generation.
