
Let’s talk about… Metrics

Metrics, measurement, assessment, evaluation — in the past ten years I’ve mostly heard these terms talked about in our field as things we “don’t know enough about,” or “don’t have the right tools to do well.”  However, as I write this, NCDD has indexed exactly 43 “assessment tools” in our online Resource Center — surveys, questionnaires, guidebooks, essays and more.

Some of my favorites:

  • A Manager’s Guide to Evaluating Citizen Participation (Tina Nabatchi, 2012)
  • A Comprehensive Approach to Evaluating Deliberative Public Engagement (John Gastil, 2008)
  • The National Civic League’s Civic Index (1999)
  • Evaluation: How Are Things Going? from Everyday Democracy (a few years old, but still a gem)

In addition to those tools, we have tagged 133 resources as having a strong “assessment” component, like the awesome 2011 toolkit from Involve titled “Making the Case for Public Engagement.”

Yet we have a lot to do in our field around metrics and evaluation, and little agreement on techniques and criteria.

Next Wednesday (on June 19th, which happens to be my birthday), I’m heading to DC for a small but important meeting convened by Tina Nabatchi of the Maxwell School of Citizenship and Public Affairs at Syracuse.  A group of open gov and public engagement leaders will be meeting to discuss what metrics for public participation should be included in the second U.S. National Action Plan for open government.

I offered to attempt to crowdsource the NCDD community’s input on metrics and evaluation, to ensure that our community’s broader-based knowledge on this topic can be tapped into during next week’s meeting in DC — and I hope you’re game!

Over the next week, I’ll be prompting you with some questions about evaluation and metrics on the NCDD Discussion list, here on the blog, and in our Facebook group.  Though this may not be the sexiest topic ever, I think we all agree that it’s a critically important topic for both practitioners and scholars to have a handle on.  I plan to both review some things that have been learned and shared over the years, and engage you in a discussion about what you do to evaluate your dialogue and deliberation efforts currently and what kinds of measurement tools might be most useful for your work going forward.

I hope you’ll all be good sports (perhaps as a birthday present to me??) and engage in this project over the course of the week!

Cartoon from Baloo’s Political Cartoon Blog at www.balooscartoonblog.blogspot.com! (Shared with permission)

Sandy Heierbacher
Sandy Heierbacher co-founded the National Coalition for Dialogue & Deliberation (NCDD) with Andy Fluke in 2002, with the 60 volunteers and 50 organizations who worked together to plan NCDD’s first national conference. She served as NCDD's Executive Director between 2002 and 2018.


Join In!


  1. In A Manager’s Guide to Evaluating Citizen Participation, Tina Nabatchi notes that there is a growing demand and desire for more and better evaluation of citizen participation, but that satisfying that need is challenging for a host of reasons:

    1. There are no comprehensive frameworks for analysis.
    2. There are no agreed-upon evaluation methods, and few reliable measurement tools.
    3. There are no widely held criteria for judging the success and failure of participation efforts.
    4. There is tremendous variety in the design and goals of participatory processes.
    5. Various audiences are likely to want different evaluation information.
    6. Evaluation can be daunting and resource intensive.

    What’s your take on these challenges? How do you think they could be overcome?

  2. Also, what tools, surveys, etc. do you use to evaluate your dialogue and deliberation programs? Did you create something yourself, or use or adapt something created by someone else?

    • Interactivity Foundation keeps track of participant demographics and uses upwards of a dozen measures of the quality of our process and its impact on participants.

      The same items are assessed in both a qualitative and quantitative way, while feedback on individual items comes from both facilitators and participants. Confidence in our measures results from comparing the different categories of responses.

      A couple of very brief additional comments on the general topic of assessment:

      1. One measures what one cares about. As a result, different process and outcome goals will require different measures. For this reason, there will inevitably be as healthy a variety of measures in use as there is variety in discussion processes and ends. For example, we’re especially concerned with the kinds of learning that result from exploratory policy discussion. More decision-oriented processes will of course look at decisions. That said, similar ends should be measured, if they can be, using state-of-the-art techniques.

      2. Still, it’s more important to be clear about what one wants to know than to be technically proficient about how one measures it.

      3. Good measures should be embedded in a conceptual framework that explains why the measure itself is important. (Our measures are tied together theoretically in chapter 4 of LET’S TALK POLITICS, our new book, which reports on 250 of our public discussions.)

      4. Finally: a caution against measurement overkill. A 1:1 map of the world doesn’t help you get from point A to point B. (It’s also expensive to draw.)

  3. Daniel Clark says:

    At AmericaSpeaks, we have tended to consider evaluation in three categories: 1) impact on the individual participants, 2) impact on policy makers and other key stakeholders, and 3) impact on policy.

    The first, impact on participants, I think is pretty straightforward, and a lot of good work has been done in this area. This can usually be measured through surveys and controlled experiments of different sorts. Dialogue and deliberation tend to increase understanding of issues, increase appreciation for different perspectives, and in some cases increase feelings of political efficacy.

    The second, impact on policy makers, is a little less clear. It can also be measured through surveys, but I don’t know that as much has been done or that the results are all that clear.

    The third, impact on policy, is the most difficult and, in the case of the federal government, I believe the most important.

    If you look at it from the perspective of the policy making community, they usually have a desired outcome. For example, the EPA wants to strengthen environmental regulations. HUD wants to secure more funding for community development. HHS wants a successful implementation of the Affordable Care Act. In these cases, you need to show that the citizen participation somehow helped move the desired policy changes forward, either by building public support, coming up with viable policy outcomes, creating political will, or at least minimizing resistance. And you need to show that it achieved this end at a lower cost or higher return than other means, such as a more straightforward communications campaign, or that it added real value to a larger communications campaign.

    Given the complexity of our political process, it can be difficult to assign successful impact to any particular activity. And it is easy for an otherwise successful effort to get derailed because of other issues.

    And if you want to go beyond the immediate goal of moving policy forward to further impact, you have to assess whether or not the policy had the desired impact, which makes full evaluation even less reasonable. One might want to assess it from the citizen’s perspective, in terms of whether or not the policy outcome matched citizens’ desires, but that can also be complicated, because citizen opinion is evolving and affected by dialogue and deliberation and other communications efforts, and so you don’t have a stable yardstick to measure against.

    For me, the jury is still out on this 3rd measure, policy impact. In fact, we have very little real data to go on. I am sure there are plenty of anecdotal examples from all over the country, and they have value, but it is hard to make them part of a “metrics” discussion with the White House and other federal agencies.

    Daniel Clark

    • Daniel, thanks for sharing. In your evaluation efforts are you tracking number of participants, frequency of engagement, growth of participation over time, etc? Are you also mapping locations of participants or doing any kind of network analysis to determine the different backgrounds of participants? If you do, can you point to a place on your web site where you show this info and discuss the process?

      What sort of budget do you have to support your evaluation efforts? It seems that the process from first reaching out to engage people to the point where you might be looking at impact on policy could be fairly long. Keeping evaluation process funded from beginning to end might be expensive.

      • Daniel Clark says:

        Daniel, thanks for your questions. We always track the number of participants and their demographics. We pride ourselves on doing a pretty good job recruiting a set of participants that matches the demographics of the relevant community. Most AmericaSpeaks events are one time events or short-term projects, so we don’t have a lot on frequency or growth.

        For one project, this study here looks at demographics and responses from our participants and compares them to results from a random digit dial survey and other sources: http://usabudgetdiscussion.org/wp-content/uploads/2010/12/OBOEResearcherReport_Final.pdf. This evaluation effort was pretty expensive, over $100K, and funded independently of the project by one of the foundations that was also helping to fund the project. Good evaluation can be expensive (but not always that expensive), so like anything, you need to consider the benefits and costs of doing it vs. using the resources in a different way. I would like to see more resources directed towards measuring impact on policy, and less on impact on individuals.

        Since you mention numbers of participants, I will add that I think number of participants is an important metric for the White House, federal agencies, and many others. It does not demonstrate impact, but it does demonstrate reach, which in some cases is a prerequisite to impact. Unfortunately, most dialogue and deliberation efforts do not scale up to large numbers very easily. AmericaSpeaks has been known for large scale, but it has been expensive. Furthermore, now that the internet is long established, many people think of large scale in terms of hundreds of thousands or millions of people.

    • Thank you for this thoughtful, detailed response, Daniel! I’ll consider it a birthday present for me. 🙂

      Your 3 categories made me think of something I wanted to share here that I think is a helpful frame for goal-setting as well as assessment. The “Goals of Dialogue & Deliberation” graphic I was inspired to create after reading Martin Carcasson’s article “Beginning with the End in Mind” outlines 3 tiers of goals…

      1. The individual and relationship-focused goals you mentioned first
      2. The results-oriented goals we’re always talking about in our field (policy change, collective action, and concrete conflict transformation) — which is similar to the 3rd goal you mentioned, and
      3. The big-picture goal of building civic capacity in communities, or strengthening communities’ capacity to solve their own problems over the long term.

      You didn’t mention this third type of goal, but I know this is also something you think about a lot at AmericaSpeaks. I think it’s even tougher to measure than policy change, since it usually happens over a longer time period and is affected by many different factors and programs. But I think we can measure some things in this area, such as how many new facilitators have been trained, how many people are now aware of quality public engagement techniques, whether new spaces (online and face-to-face) have been established for citizens to gather, etc.

    • Daniel’s points from AmericaSpeaks are right on target! With respect to policy impact specifically related to the 2010 ACA, take a look at the 2004 NIF report on “Examining Healthcare: What is the Public’s Prescription? Results from Citizen Forums”. A key finding from that national report was the request from forum participants for the federal government to create an “Ombudsman to Help People Navigate the System”.

      A key component of the current law is the grant competition now available for organizations to help citizens navigate the new healthcare marketplaces. In other words, the forums back then accurately reflected the need for a desired solution which was eventually implemented in the reform legislation.

      Another example can be found in the 2006 NIF report “Public Thinking About the New Challenges of American Immigration”. As we approach the culmination of yet another legislative reform effort to fix another broken system, you can again see “the boundaries of political permission” which show officials the course of action participants are willing to take along with the tradeoffs that are acceptable after weighing costs and consequences of different approaches. Again, many of the desired actions created in the community discussions on immigration are reflected in the current legislation.

      With the current concerns about NSA data collection and the Patriot Act, officials could take a look at the results from NIF forums on “Terrorism: What Should We Do Now?”. Skipping the prescient questions participants raised at the time about the war with Iraq as well as the overall economic costs of the war on terror, there are important insights about civil liberties and increased surveillance. Such data would surely be of interest and use to public officials.

  4. Margaret Holt says:

    Is the objective with these measures/evaluations what John Cavanaugh has stated? – to demonstrate “that public deliberations achieve maximum impact on public policy?” Who are the intended recipients of the evaluation reports? What have they said about what they want to know in their roles as policymakers and shapers? What do you know about their current utilization of measures/evaluations? I am certain that many of the tools in your resource center are superior. The challenge often is finding out what types of reports will grab the attention of those in positions to make and influence public policy. I personally believe that quite often superior reports are prepared regarding the public’s thinking about policies, and they are disseminated but never utilized. This is to say that it might be as useful to think about what strategies are employed to bring the public’s perspectives to the attention of the policymakers. Dissemination does not equal utilization. Happy Birthday in advance. Margaret Holt

    • These are all great questions, Margaret! The broad goal of the meeting we’re having next week is to identify best practices for public participation in government (especially federal agencies) and suggest metrics that will allow agencies to assess their progress toward the goal of becoming more participatory. Despite open gov efforts and guidance so far, most federal agencies’ open gov plans have failed to include standards for what constitutes high-quality public participation.

      Though we’ll be looking specifically at how agencies can measure progress toward becoming more participatory, I’m looking to engage NCDD members much more broadly this week about assessment and metrics.

      And to answer your first question more directly, no, I don’t think that the primary objective of these measures will be to demonstrate that public deliberations achieve maximum impact on public policy. I think federal agencies will be more interested in goals like utilizing public input and public judgment to help them make better policy decisions, and engaging the public to increase awareness of the issues they are working on.

      • The goal of achieving maximum impact on public policy is simply one item forum organizers might consider when tackling national issues such as the community conversations on mental health. As Daniel Clark has wisely noted above, the jury is still out on this question until we are able to systematically collect and report out convincing metrics. Again, the AmericaSpeaks effort on “Our Budget, Our Economy” represents the most promising case study so far, followed by the “Social Security Challenge” project from 1996 which featured forums on “The National Piggy Bank: Does Our Retirement System Need Fixing?”.

  5. What are NCDD members’ suggestions for resources, arguments, and more that I should consider taking to the meeting with me?

    • I’d suggest two things with regard to tools and approaches for assessment and evaluation:

      1) The SenseMaker software, designed by David Snowden and marketed through his organization, Cognitive Edge. This is a unique approach to understanding and gaining insight for decision-makers. It is based on story collection, and allows the people in the system to determine meaning themselves. SenseMaker can be used for assessment and evaluation as well as policy and action decisions. A presentation on SenseMaker can be found at
      http://www.evaluation-conference.de/downloads/21_B2_SenseMaker_poster.pdf and there are many more resources online.

      2) Developmental Evaluation is a great book by Michael Quinn Patton, exploring this form of program and policy evaluation. It is intended to be used when dealing with complex challenges, where we typically encounter “unknown unknowns” as we try to understand what is happening (and thereby, know what to do). Developmental Evaluation also provides real-time feedback and learning loops, which enable ongoing change and improvement as people are in the midst of their efforts. This seems especially well-suited to many of the challenges in assessing dialogue-based citizen engagement initiatives.

    • John Cavanaugh says:

      Consider drawing upon your experience with National Issues Forums research as well as the methodology utilized to create reports on public thinking about key topics during your meeting.

      With public administrators, the first hurdle is always to differentiate the reports from deliberative forums from the traditional polling they’re accustomed to. It is important to clearly state they do not report a random sample of aggregated individual responses in an attempt to reflect public opinion at any given time. They do intend to reflect a deeper pattern of collective thinking when community groups are given a non-partisan opportunity to weigh the costs and consequences of different approaches to resolve difficult public policy problems. Demographic data, survey questions, and moderator interviews are recorded and can be compared to current public opinion polling for the purpose of discussion.

      Why is this important? Given the poor quality of political discourse we currently endure on most issues and the many ways organized special interests use to disrupt unstructured “town hall” conversations, it is clear that citizens need “a different way to talk, another way to act”. There are a myriad of examples beyond the NIF Reports you know. For example, the AmericaSpeaks effort on “Our Budget, Our Economy” also shows the true power of public work accomplished by everyday citizens. In short, people are willing to sit down with their neighbors and make hard choices to solve problems.

    • Bill Potapchuk says:

      I’d like to suggest that while developing metrics is important, they can be used to drive a rather simplistic conversation about what works and what does not. An excellent article by Lisbeth Schorr in the Stanford Social Innovation Review helps illuminate this challenge. She describes the current conversations about evaluation and metrics as a battle between the inclusionists and the experimentalists. Experimentalists seek controlled studies, using limited metrics, that help lead to identified “evidence-based” interventions that can be scaled and/or replicated in a wide range of settings. Inclusionists start with a presumption that context matters, both quantitative and qualitative data are important, and that most social innovations cannot be replicated . . . they can only be adapted (simplistic explanations for both). It should be noted that one of the reasons Schorr wrote this article is that the feds are largely in the experimentalist camp these days.

      I would certainly identify as a member of the inclusionist camp. In my review of the common metrics in the field, most of them fall into the necessary but not sufficient category. That is, the metrics help identify whether the mechanics of a process were well executed, but they much less commonly help illuminate the things that really matter.

      Did decision makers really listen? Was bridging social capital built among those who are different from each other? Did the thinking and deliberation really help tackle/solve important social problems? Were there implementable (and ultimately implemented) outcomes? We often identify gains in these areas as secondary and tertiary outcomes of D&D processes. What if those were the primary outcomes we sought?

  6. What do you measure specifically? What should we be measuring?

    • Tom Atlee says:

      To determine whether a particular public participation initiative actually represents what “the public” thinks, feels, wants, and needs (or would if they had a chance to clearly think about it), diversity is my number one consideration: (a) diversity of participants, (b) diversity of information and options, and (c) how diversity is handled in the process.

      (a) Does the diversity of the citizens involved reflect the diversity of the population from which they were drawn? Conscious attention to diversity of any kind is valuable. I see random selection and equal access logistics (e.g., is child care provided so parents can participate?) as key factors. Demographics are the usual measure (which can be more or less complex, depending on the demographic variables considered).

      (b) Does the diversity of the information and options participants are working with – among themselves, in briefing materials, in expert testimony, etc. – adequately reflect the diversity of views on the topic being explored? This is most easily measured by post-event participant survey (although more extensive public and/or expert survey is possible).

      (c) Diversity is also enhanced or undermined by the process and facilitation: How are diverse participant passions, perspectives, creativity, and concerns dealt with? Are they offered, elicited, welcomed, nurtured, explored, acknowledged, set aside, suppressed? This, too, can be readily evaluated by post-event participant survey, as in “How well were the diverse views of participants handled during this event?”

      All three of these dimensions of diversity – participants, information, and process – are of vital importance if we want to access or generate useful, valid public wisdom about public affairs. (In my view, “public” wisdom differs from “citizen” wisdom in that the former is explicitly collective, whereas the latter can simply be the collected input of individuals.)

      (I acknowledge that this focus on diversity is less relevant – although not always irrelevant – in events where the purpose is to activate the participants to address a community or public issue themselves. Diversity becomes central, however, when the purpose is to develop public judgment or wisdom that will inform or influence decision-makers, the media, and/or the citizenry at large.)
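
For readers who want to try the demographic comparison in point (a), here is a minimal sketch, not a method described by anyone in this thread. It assumes participant counts and population shares use the same group categories; the function names, the age brackets, and all the numbers are hypothetical. It reports each group's over- or under-representation and a single dissimilarity index (0 means participants mirror the population exactly).

```python
def representation_gaps(participants, population):
    """Participant share minus population share, per group.

    Assumes both inputs use the same group labels (an assumption of this sketch).
    """
    total = sum(participants.values())
    return {
        group: participants.get(group, 0) / total - share
        for group, share in population.items()
    }


def dissimilarity_index(participants, population):
    """Half the sum of absolute gaps: 0 is a perfect match, 1 is no overlap."""
    gaps = representation_gaps(participants, population)
    return sum(abs(gap) for gap in gaps.values()) / 2


# Hypothetical numbers, for illustration only.
participant_counts = {"18-34": 20, "35-64": 50, "65+": 30}
population_shares = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

gaps = representation_gaps(participant_counts, population_shares)
index = dissimilarity_index(participant_counts, population_shares)

for group, gap in gaps.items():
    print(f"{group}: {gap:+.0%}")
print(f"dissimilarity index: {index:.2f}")
```

A single index is easy to report but hides which groups are missing, so the per-group gaps are probably the more useful output for recruitment decisions.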

    • Muriel Strand offered this comment on NCDD’s Facebook page…

      Just off the cuff, the rule of thumb I took away from a course in using statistics in policy decisions was to look for “quantifiable qualities.” So you identify your qualitative goals, their essential indicators, types of evidence that your goals are being realized to various degrees. Then the degrees become the quantities.

      • Muriel Strand says:

        Also, making sure to always have a response option of “No opinion” or “NA” will make your stats more robust.
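
One way to picture the two suggestions above (turning degrees into quantities, and keeping a “No opinion”/“NA” option) is the rough sketch below. The rubric labels, the scores assigned to them, and the sample answers are all hypothetical; the point is only that non-answers are counted separately instead of being forced into the average.

```python
# Hypothetical rubric: each qualitative degree is assigned a number.
DEGREES = {"not at all": 0, "somewhat": 1, "mostly": 2, "fully": 3}
NON_ANSWERS = {"no opinion", "na"}


def summarize(responses):
    """Average the substantive answers and count non-answers separately.

    Responses that match neither the rubric nor the non-answer labels
    are ignored in this sketch.
    """
    cleaned = [r.strip().lower() for r in responses]
    scores = [DEGREES[r] for r in cleaned if r in DEGREES]
    skipped = sum(1 for r in cleaned if r in NON_ANSWERS)
    mean = sum(scores) / len(scores) if scores else None
    return {"mean": mean, "answered": len(scores), "skipped": skipped}


# Hypothetical survey answers to "Were diverse views handled well?"
answers = ["Fully", "Mostly", "NA", "Somewhat", "No opinion", "Mostly"]
summary = summarize(answers)
print(summary)
```

Reporting the skipped count alongside the mean lets a reader see how many people actually held an opinion on the item.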

    • Thanks for this topic Sandy. We’ve struggled a bit with the lack of reliable agreed-upon measures in this field, and have made some modest steps to try to address this lack. It’s great to see other people’s perspectives and I’m finding myself nodding in agreement with all the things listed as important. Here are two things we’ve made a bit of progress on related to metrics:

      First, in the various public engagement activities I’ve worked on, I’ve gotten in the habit of trying, at the very least, to look at the varieties of ways that people engage, and to start to better understand these and their impact on different desired outcomes (e.g., what are their relationships to learning, quality of suggestions/arguments, satisfaction with the engagement experience, political efficacy, deliberative beliefs, trust, and so on).

      So, we’ve developed a measure that reliably assesses eight different ways people might engage (active/metacognitive, conscientious, disinterested/bored, angry/frustrated, creative, open-minded, closed-minded, social). We are finding this measure practically useful for gaining deeper insight into people’s experiences during the engagement, how these experiences do or don’t predict the hoped-for outcomes, and then refining or changing our practices and re-examining the effects. For example, it seems that conscientious engagement is best predictive of learning from the event (active/metacognitive is also sometimes predictive), and angry engagement can relate to disengagement, but not always (sometimes it does predict learning). The engagement measure is under review at a journal, but I’m happy to share it if you (or anyone) would like to see it.

      Second, because of the importance of “trust” to engagements (as part of the process, and also as a desired outcome), another measure that we have been working on is a measure of trust and specific trust-related perceptions (e.g., for institutions, perceptions of competence, integrity/character, benevolence/care, legitimacy, fairness; and we are working on other measures too, such as trust in information). We do not (yet) have an article describing our “best” scales/subscales, but are happy to share our items. We are finding this measure useful for a more nuanced understanding of reasons why people might trust/distrust. E.g., in our public engagements for city govt, we found that people who attended the face to face events already thought the city was high in integrity, but pre-to-post the event they increased in how “neutral” (part of fairness) and competent they perceived the govt. Such changes also then correlated with changes (increases) in support for city programs.

      If people are interested in these measures, my email is lpytlikz (at) nebraska.edu. I’d love to know if others have some reliable/valid measures either under development or to share and how they have used (and benefited from) such measures.
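
The pre-to-post comparison described in this comment could be explored along the lines of the sketch below. It is not the commenter's actual analysis: the ratings are made-up 1-to-5 scores for five hypothetical participants, and the helper names are invented for illustration. It computes each person's change on two measures and the Pearson correlation between those changes.

```python
from math import sqrt


def changes(pre, post):
    """Per-participant change: post-event score minus pre-event score."""
    return [after - before for before, after in zip(pre, post)]


def pearson(xs, ys):
    """Pearson correlation; assumes equal lengths and non-zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical 1-5 ratings from the same five participants, in the same order.
competence_pre = [2, 3, 3, 4, 2]
competence_post = [3, 3, 4, 5, 4]
support_pre = [3, 3, 2, 4, 3]
support_post = [4, 3, 3, 4, 5]

competence_change = changes(competence_pre, competence_post)
support_change = changes(support_pre, support_post)
r = pearson(competence_change, support_change)
print(f"correlation of changes: {r:.2f}")
```

A positive correlation here only says the two changes moved together in this sample; it does not show that one caused the other, and five participants is far too few for a real study.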
