Promoting 'Dynamic Documents' and 'Living Research Projects'

Dynamic Documents

By “Dynamic Documents” I mean papers/projects built with Quarto, R Markdown, or Jupyter notebooks (the most prominent tools) that carry out and report the data analysis (as well as math/simulations) in the same space where the results and discussion are presented (with ‘code blocks’ hidden).

I consider some of the benefits of this format, particularly for EA-aligned organizations like Open Philanthropy, in Benefits of Dynamic Documents.

Living Research Projects

“Continually update a project” rather than start a “new extension paper” when you see what you could have done better.

The main idea is that each version is given a specific time stamp, and that is the object that is reviewed and cited. This is more or less already the case when we cite working papers/drafts/mimeos/preprints.

See Benefits of Living Research Projects, which further discusses the potential benefits.

Global priorities: Theory of Change (Logic Model)

Our theory of change is shown above as a series of possible paths; we indicate what is arguably the most "direct" path in yellow. All of these paths begin with our setting up, funding, communicating, and incentivizing participation in a strong, open, efficient research evaluation system (in green, at the top). These processes all lead to impactful research being more in-depth, more reliable, more accessible, and more useful, and thus better informing decision-makers and leading to better decisions and outcomes (in green, at the bottom).

You can zoom in on the diagram below.

Highlighting some of the key paths:

(Yellow) Faster and better feedback on impactful research improves this work and better informs policymakers and philanthropists.

(Blue) Our processes and incentives will foster ties between (a) mainstream and prominent academic and policy researchers and (b) global-priorities or EA-aligned researchers. This will improve the rigor, credibility, exposure, and influence of previously "EA niche" work while helping mainstream researchers better understand and incorporate ideas, principles, and methods from the EA and rationalist research communities (such as counterfactual impact, cause-neutrality, reasoning transparency, and so on). This process will also nudge mainstream academics towards focusing on impact and global priorities, and towards making their research and outputs more accessible and useable.

(Pink) The Unjournal’s more efficient, open, and flexible processes will become attractive to academics and stakeholders. As we become better at "predicting publication outcomes," we will become a replacement for traditional processes, improving research overall—some of which will be highly impactful research.

Detailed explanations of key paths

Rapid, informative, transparent feedback and evaluation to inform policymakers and researchers

Rigorous quantitative and empirical research in economics, business, public policy, and social science has the potential to improve our decision-making and enable a flourishing future. This can be seen in the research frameworks proposed by 80,000 Hours, Open Philanthropy, and The Global Priorities Institute (see the discussions here). This research is routinely used by effective altruists working on global priorities or existential risk mitigation. It informs both philanthropic decisions (e.g., those influenced by GiveWell's Cost-Effectiveness Analyses, whose inputs are largely based on academic research) and national public policy. Unfortunately, the academic publication process is notoriously slow; for example, in economics, it routinely takes 2–6 years between the first presentation of a research paper and the eventual publication in a peer-reviewed journal. Recent reforms have sped up parts of the process by encouraging researchers to put working papers and preprints online.

However, working papers and preprints often receive at most a cursory check before publication, and it is up to the reader to judge quality for themselves. Decision-makers and other researchers rely on peer review to judge the work’s credibility. This part remains slow and inefficient. Furthermore, it provides very noisy signals: a paper is typically judged by the "prestige of the journal it lands in" (perhaps after an intricate odyssey across journals), but it is hard to know why it ended up there. Publication success is seen to depend on personal connections, cleverness, strategic submission strategies, good presentation skills, and relevance to the discipline’s methods and theory. These factors are largely irrelevant to whether and how philanthropists and policymakers should consider and act on a paper’s claimed findings. Reviews are kept secret; the public never learns why a paper was deemed worthy of a journal, nor what its strengths and weaknesses were.

We believe that disseminating research sooner—along with measures of its credibility—is better.

We also believe that publicly evaluating its quality before (and in addition to) journal publication will add substantial additional value to the research output, providing:

  1. a quality assessment (by experts in the field) that decision-makers and other researchers can read alongside the preprint, helping these users weigh its strengths and weaknesses and interpret its implications; and

  2. faster feedback to authors focused on improving the rigor and impact of the work.

Various initiatives in the life sciences have already begun reviewing preprints. While economics took the lead in sharing working papers, public evaluation of economics, business, and social science research is rare. The Unjournal is the first initiative to publicly evaluate rapidly-disseminated work from these fields. Our specific priority: research relevant to global priorities.

So, how does this contribute to better 'survival and flourishing' outcomes?

The Unjournal will encourage and incentivize substantive and helpful feedback and careful quantitative evaluation. We will publish these evaluations in a carefully curated space, and clearly aggregate and communicate this output.

This will help us achieve our focal, most tangible "theory of change" pathway (mapped in our "Plan for Impact"):

  • Better (faster, public, more thorough, more efficient, quantified, and impact-minded) evaluation of pivotal research

  • makes this research better (both the evaluated work and adjacent work) and encourages more such work

  • and makes it easier for decision makers to evaluate and use the work, leading to better decisions and better outcomes,

  • thus reducing X-risk and contributing to long-term survival and flourishing.

Faster, better feedback; attractiveness to researchers and gatekeepers; improved research formats; and better and more useful research

The Unjournal’s open feedback should also be valuable to the researchers themselves and their research community, catalyzing progress. As the Unjournal Evaluation becomes a valuable outcome in itself, researchers can spend less time "gaming the journal system." Shared public evaluation will provide an important window to other researchers, helping them better understand the relevant cutting-edge concerns. The Unjournal will permit research to be submitted in a wider variety of useful formats (e.g., dynamic documents and notebooks rather than "frozen pdfs"), enabling more useful, replicable content and less time spent formatting papers for particular journals. We will also allow researchers to improve their work in situ and gain updated evaluations, rather than having to spin off new papers. This will make the literature more clear and less cluttered.

"Some of the main paths"

Achieving system change in spite of collective action issues

Some of the paths will take longer than others; in particular, it will be hard to get academia to change, particularly because of entrenched systems and a collective action problem. We discuss below how we hope to overcome this. In particular, we can provide leadership and take risks that academics won’t take themselves:

  • Bringing in new interests, external funding, and incentives can change the fundamental incentive structure.

  • We can play a long game and build our processes and track record while we wait for academia to incorporate journal-independent evaluations directly into their reward systems. Meanwhile, our work and output will be highly useful to EA and global-priorities longtermist researchers and decision makers as part of their metrics and reward systems.

  • The Unjournal’s more efficient, open, and flexible processes will become attractive to academics and stakeholders. As we become better at "predicting publication outcomes," we will become a replacement for traditional processes, improving research overall—some of which will be highly impactful research.

  • This process will also nudge mainstream academics towards focusing on impact and global priorities, and towards making their research and outputs more accessible and useable.

Promoting open and robust science

TL;DR: The Unjournal promotes research replicability and robustness.

Unjournal evaluations aim to support the "Reproducibility/Robustness-Checking" (RRC) agenda. We are directly engaging with the Institute for Replication (I4R) and the repliCATS project (RC), and building connections to Replication Lab/TRELiSS and Metaculus.

We will support this agenda by:

  1. Promoting data and code sharing: We request that preprint authors share their code and data, and we reward them for their transparency.

  2. Promoting 'Dynamic Documents' and 'Living Research Projects': Breaking out of "PDF prisons" to achieve increased transparency.

  3. Encouraging detailed evaluations: Unjournal evaluators are asked to:

    • highlight the key/most relevant research claims, results, and tests;

    • propose possible robustness checks and tests (RRC work); and

    • make predictions for these tests.

  4. Implementing computational replication and robustness checking: We aim to work with I4R and other organizations to facilitate and evaluate computational replication and robustness checking.

  5. Advocating for open evaluation: We prioritize making the evaluation process transparent and accessible for all.

Research credibility

While the replication crisis in psychology is well known, economics is not immune. Some very prominent and influential work has blatant errors, depends on dubious econometric choices or faulty data, is not robust to simple checks, or uses likely-fraudulent data. Roughly 40% of experimental economics work fails to replicate. Prominent commenters have argued that the traditional journal peer-review system does a poor job of spotting major errors and identifying robust work.

Supporting the RRC agenda through Unjournal evaluations

My involvement with the SCORE replication market project shed light on a key challenge (see Twitter posts): The effectiveness of replication depends on the claims chosen for reproduction and how they are approached. I observed that it was common for the chosen claim to miss the essence of the paper, or to focus on a statistical result that, while likely to reproduce, didn't truly convey the author's message.

Simultaneously, I noticed that many papers had methodological flaws (for instance, lack of causal identification or the presence of important confounding factors in experiments). But I thought that these studies, if repeated, would likely yield similar results. These insights emerged from only a quick review of hundreds of papers and claims. This indicates that a more thorough reading and analysis could potentially identify the most impactful claims and elucidate the necessary RRC work.

Indeed, detailed, high-quality referee reports for economics journals frequently contain such suggestions. However, these valuable insights are often overlooked and rarely shared publicly. Unjournal aims to change this paradigm by focusing on three main strategies:

  1. Identifying vital claims for replication:

    • We plan to have Unjournal evaluators help highlight key "claims to replicate," along with proposing replication goals and methodologies. We will flag papers that particularly need replication in specific areas.

    • Public evaluation and author responses will provide additional insight, giving future replicators more than just the original published paper to work with.

  2. Encouraging author-assisted replication:

    • The Unjournal's platform and metrics, promoting dynamic documents and transparency, simplify the process of reproduction and replication.

    • By emphasizing replicability and transparency at the working-paper stage (Unjournal evaluations’ current focus), we make authors more amenable to facilitating replication work in later stages, such as after traditional publication.

  3. Predicting replicability and recognizing success:

    • We aim to ask Unjournal evaluators to make predictions about replicability. When the work is later successfully replicated, we can offer recognition. The same holds for repliCATS aggregated/IDEA group evaluations: to know whether we are credibly assessing replicability, we need to compare these assessments to at least some "replication outcomes."

    • The potential to compare these predictions to actual replication outcomes allows us to assess the credibility of our replicability evaluations. It may also motivate individuals to become Unjournal evaluators, attracted by the possibility of influencing replication efforts.
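To illustrate how such predictions could later be checked against actual replication outcomes, here is a minimal Python sketch using a Brier score (the mean squared error between predicted probabilities and 0/1 outcomes). The papers, probabilities, and outcomes below are hypothetical; this shows one standard scoring rule, not The Unjournal's adopted procedure.

```python
# Minimal sketch (hypothetical data): scoring evaluators' replicability
# predictions against eventual replication outcomes with a Brier score.
predictions = {"paper_1": 0.85, "paper_2": 0.40, "paper_3": 0.70}  # P(replicates)
outcomes    = {"paper_1": 1,    "paper_2": 0,    "paper_3": 0}     # 1 = replicated

def brier_score(preds, outs):
    """Mean squared error between predicted probabilities and binary outcomes."""
    return sum((preds[k] - outs[k]) ** 2 for k in preds) / len(preds)

print(f"Brier score: {brier_score(predictions, outcomes):.3f}")  # lower is better
```

Tracked over time, scores like this could underpin the recognition and credibility checks described above.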

By concentrating on NBER papers, we increase the likelihood of overlap with journals targeted by the Institute for Replication, thus enhancing the utility of our evaluations in aiding replication efforts.

Other mutual benefits/synergies

We can rely on and build a shared talent pool: UJ evaluators may be well-suited—and keen—to become robustness-reproducers (of these or other papers) as well as repliCATS participants.

We see the potential for synergy and economies of scale and scope in other areas, e.g., through:

  • sharing of IT/UX tools for capturing evaluator/replicator outcomes, and statistical or information-theoretic tools for aggregating these outcomes;

  • sharing of protocols for data, code, and instrument availability (e.g., Data and Code Availability Standard);

  • communicating the synthesis of "evaluation and replication reports"; or

  • urging institutions, journals, funders, and working paper series to encourage or require engagement.

More ambitiously, we may jointly interface with prediction markets. We may also jointly integrate into platforms like OSF as part of an ongoing process of preregistration, research, evaluation, replication, and synthesis.

Broader synergies in the medium term

As a "journal-independent evaluation" gains career value, as replication becomes more normalized, and as we scale up:

  • This changes incentive systems for academics, which makes rewarding replication/replicability easier than with the traditional journals’ system of "accept/reject, then start again elsewhere."

  • The Unjournal could also evaluate I4R replications, giving them status.

  • Public communication of Unjournal evaluations and responses may encourage demand for replication work.

In a general sense, we see cultural spillovers: a greater willingness to try new systems for reward and credibility, and gatekeepers rewarding this behavior rather than just the traditional "publication outcomes".

Benefits of Living Research Projects

Living, "Kaizen", "permanent Beta" work

Should research projects be improved and updated 'in the same place', rather than with 'extension papers'?

Advantages of 'permanent Beta' projects

  • Small changes and fixes: The current system makes it difficult to make minor updates – even obvious corrections – to published papers. This makes these papers less useful and less readable. If you find an error in your own published work, there is also little incentive to note it and ask for a correction, even if this were possible.

    • In contrast, a 'living project' could be corrected and updated in situ. If future and continued evaluations matter, authors will have an incentive to do so.

  • Lack of incentives for updates and extensions: If academic researchers see major ways to improve and build on their past work, these can be hard to get published and get credit for. The academic system rewards novelty and innovation, and top journals are reluctant to publish 'the second paper' on a topic. As this would count as 'a second publication' (for tenure etc.), authors may be accused of double-dipping, and journals and editors may punish them for this.

  • Clutter and confusion in the literature: Because of the above, researchers often try to spin an improvement to a previous paper as very new and different. They sometimes publish a range of papers getting at similar things with similar methods across different journals. This makes it hard for other researchers and readers to know which paper they should read.

    • In contrast, a 'living project' can keep these in one place. The author can lay out different chapters and sections in ways that make the full work most useful.

But we recognize there may also be downsides to 'all extensions and updates in a single place'...

Discussion: living projects vs the current "replication+extension approach"

SK: why [do] you think it's significantly better than the current replication+extension approach?

PS (?): Are these things like 'living' Google Docs that keep getting updated? If so, I'd consider using workarounds to replicate their benefits on the forum for a test run (e.g., people add a version to the paper title or content, or post a new version for each major revision). More generally, I'd prefer the next publication norm for papers to be about making new 'versions' of prior publications (e.g., a 'living review' paper on X is published and reviewed each year) rather than creating live documents (e.g., a dynamic review on X is published on a website and repeatedly reviewed at frequent and uncertain intervals when the authors add to it). I see huge value in living documents. However, I feel that they wouldn't be as efficient/easy to supervise/review as 'paper versions'.

@GavinTaylor: I don’t think living documents need to pose a problem as long as they are discretely versioned and each version is accessible. Some academic fields are/were focused on books more than papers, and these were versioned by edition. Preprinting is also a form of versioning, and combining the citations of the published paper and its preprint(s) seems to be gaining acceptance (well, Google Scholar may force this by letting you combine them). I don’t recall ever seeing a preprint citation indicating a specific version (on preprint servers that support this), but it seems possible.

DR: I mainly agree with @gavintaylor, but I appreciate that 'changing everything at the same time' is not always the best strategy.

The main idea is that each version is given a specific time stamp, and that is the object that is reviewed and cited. This is more or less already the case when we cite working papers/drafts/mimeos/preprints.

Gavin, on the latter 'past version accessibility' issue: this could/should be part of what we ensure with specific rules and tech support, perhaps.

I think the issue with the current citing practice for live documents like webpages is that even if a ‘version’ is indicated (e.g. access date) past versions aren’t often very accessible.

They might also not be ideal for citing as they would be an ever-changing resource. I can imagine the whole academic system struggling to understand and adapt to such a radical innovation given how focused it is on static documents. With all of this considered, I'd like 'dynamic/living work' to be incentivised with funding and managed with informal feedback and comments rather than being formally reviewed (at least for now). I'd see living review work as sitting alongside and informing 'reviewed' papers rather than supplanting them. As an example, you might have a website that provides a 'living document' for lay people about how to promote charity effectively and then publish annual papers to summarise the state of the art for an academic/sophisticated audience.

DR: I can see the arguments on both sides here. I definitely support replications, and sometimes it may make sense for the author to “start a new paper” rather than make it an improvement of the old one. I also think that the project should be time-stamped, evaluated, and archived at particular stages of its development.

But I lean towards thinking that in many or most cases, a single project with multiple updates will make the literature clearer and easier to navigate than the current proliferation of “multiple very similar papers by the same author in different journals”. It also seems a better use of researcher time than having to constantly restate and repackage the same things.

[1] Some discussion follows. Note that The Unjournal enables this but does not require it.


Balancing information accessibility and hazard concerns

We acknowledge the potential for "information hazards" when research methods, tools, and results become more accessible. This is of particular concern for direct physical and biological science research, especially biosecurity (although there is a case that specific open science practices may be beneficial). ML/AI research may also fall into this category. Despite these potential risks, we believe that the fields we plan to cover—detailed above—do not primarily present such concerns.

In cases where our model might be extended to high-risk research—such as new methodologies contributing to terrorism, biological warfare, or uncontrolled AI—the issue of accessibility becomes more complex. We recognize that increasing accessibility in these areas might potentially pose risks.

While we don't expect these concerns to be raised frequently about The Unjournal's activities, we remain committed to supporting thoughtful discussions and risk assessments around these issues.


Benefits of Dynamic Documents

'Dynamic Documents' are projects or papers developed using tools such as Quarto, R Markdown, or Jupyter notebooks (the most prominent tools).

The salient features and benefits of this approach include:

  • Integrated data analysis and reporting means the data analysis (as well as math/simulations) is done and reported in the same space where the results and discussion are presented. This is made possible by hiding the 'code blocks' (see the sketch after this list).

  • Transparent reporting means you can track exactly what is being reported and how it was constructed:

    • Making the process a lot less error-prone

    • Helping readers understand it better (see 'explorable explanations')

    • Helping replicators and future researchers build on it

  • Other advantages of these formats (over PDFs for example) include:

    • Convenient ‘folding blocks’

    • Margin comments and links

    • Integration of interactive tools
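To make this concrete, below is a minimal, hypothetical sketch of a Quarto-style dynamic document with a single Python code chunk. The data file and column names are invented for illustration; the point is that the `echo: false` chunk option hides the code while the rendered page still shows its output next to the surrounding discussion.

````markdown
---
title: "Effect of a hypothetical treatment on giving"
format: html
---

## Results

```{python}
#| echo: false
# The analysis runs here, but the rendered document shows only the output.
import pandas as pd

df = pd.read_csv("donations.csv")   # hypothetical dataset
diff = (df.loc[df["treated"] == 1, "gift"].mean()
        - df.loc[df["treated"] == 0, "gift"].mean())
print(f"Mean gift was {diff:.2f} higher in the treatment group.")
```

The discussion of this result sits beside the code that produced it, so readers,
evaluators, and replicators can re-run and audit the analysis directly.
````

Because rendering the document re-runs the analysis, the numbers reported in the text cannot silently drift away from the underlying data and code.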

Better examples: the case for dynamic documents

Also consider...
  • eLife's 'editable graphics'... Bret Victor?

  • see corrigenda in journals; the Reinhart and Rogoff error?

  • the Open Science MOOC in R Markdown...

  • OSF and all of their training/promo materials in open science

Reinstein's own examples

Some quick examples from my own work in progress (but other people have done it much better)

Other (randomly selected) examples

Reshaping academic evaluation: Beyond accept/reject

Rate and give feedback, don’t accept/reject

Claim: Rating and feedback is better than an ‘all-or-nothing’ accept/reject process. Although people like to say “peer review is not binary”, the consequences are.

“Publication in a top journal” is used as a signal and a measuring tool for two major purposes. First, policymakers, journalists, and other researchers look at where a paper is published to assess whether the research is credible and reputable. Second, universities and other institutions use these publication outcomes to guide hiring, tenure, promotion, grants, and other ‘rewards for researchers.’

Did you know? More often than not, academic economists speak of the "supply of spaces in journals” and the “demand to publish in these journals”. Who is the consumer? Certainly not the perhaps-mythical creature known as the ‘reader’.

But don't we need REJECTION as a hard filter to weed out flawed and messy content?

Perhaps not. We are accustomed to using ratings as filters in our daily lives. Readers, grantmakers, and policymakers can set their own threshold. They could disregard papers and projects that fail to meet, for instance, a standard of at least two peer reviews, an average accuracy rating above 3, and an average impact rating exceeding 4.
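As a rough illustration of how such reader-defined thresholds might work, here is a minimal Python sketch. The papers, ratings, and cutoffs are hypothetical, and this is not an actual Unjournal interface; it simply shows ratings being used as a filter rather than a binary accept/reject.

```python
# Minimal sketch: a reader-side filter over hypothetical Unjournal-style ratings.
papers = [
    {"title": "Paper A", "n_evals": 3, "accuracy": 3.8, "impact": 4.4},
    {"title": "Paper B", "n_evals": 1, "accuracy": 4.9, "impact": 4.7},
    {"title": "Paper C", "n_evals": 2, "accuracy": 2.6, "impact": 4.8},
]

def passes_filter(paper, min_evals=2, min_accuracy=3.0, min_impact=4.0):
    """Each user sets their own thresholds instead of relying on accept/reject."""
    return (paper["n_evals"] >= min_evals
            and paper["accuracy"] > min_accuracy
            and paper["impact"] > min_impact)

print([p["title"] for p in papers if passes_filter(p)])  # -> ['Paper A']
```

A grantmaker could tighten these cutoffs, while a curious reader could loosen them; the underlying evaluation data stays the same.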

Pursuing 'top publications' can be very time-consuming and risky for career academics

In the field of economics, it is not unusual for years to pass between the first publicly circulated working paper and the final publication. During that time, the paper may be substantially improved, but it may not be known to nor accepted by practitioners. Meanwhile, it provides little or no career value to the authors.

As a result, we see three major downsides:

  1. Time spent gaming the system: Researchers and academics spend a tremendous amount of time 'gaming' this process, at the expense of actually doing research.

  2. Randomness in outcomes, unnecessary uncertainty, and stress.

  3. Wasted feedback, including reviewers' time.

Time spent gaming the system

I (Reinstein) have been in academia for about 20 years. Around the departmental coffee pot and during research conference luncheons, you might expect us to talk about theories, methods, and results. But roughly half of what we talk about is “who got into which journal and how unfair it is”; “which journal should we be submitting our papers to?”; how long are their “turnaround times?”; “how highly rated are these journals?”; and so on. We even exchange tips on how to ‘sneak into these journals’.

There is a lot of pressure, and even bullying, to achieve these “publication outcomes” at the expense of careful methodology.

Randomness in outcomes

The current system can sideline deserving work due to unpredictable outcomes. There's no guarantee that the cream will rise to the top, making research careers much more stressful—even driving out more risk-averse researchers—and sometimes encouraging approaches that are detrimental to good science.

Wasted feedback

A lot of ‘feedback’ is wasted, including the reviewers' time. Some reviewers write ten-page reports critiquing the paper in great detail, even when they reject it. These reports are sometimes very informative and useful for the author, and they would also be very helpful for the wider public and research community in understanding the nature of the debate and the issues.

However, researchers often have a very narrow focus on getting the paper published as quickly and in as high-prestige a journal as possible. Unless the review is part of a 'Revise and Resubmit' that the author wants to fulfill, they may not actually put the comments into practice or address them in any way.

Of course, the reviews may be misinformed, mistaken, or may misunderstand aspects of the research. However, if the paper is rejected (even if the reviewer was positive about the paper), the author has no opportunity or incentive to respond to the reviewer. Thus the misinformed reviewer may remain in the dark.

The other side of the coin: a lot of effort is spent trying to curry favor with reviewers, who are often seen as overly fussy, and this effort is not always in the direction of good science.

Some examples (quotes):

John List (Twitter 5 July 2023): "We are resubmitting a revision of our study to a journal and the letter to the editor and reporters is 101 pages, single-spaced. Does it have to be this way?"

Paola Masuzzo: “I was told that publishing in Nature/Cell/Science was more important than everything else.”

Anonymous: "This game takes away the creativity, the risk, the ‘right to fail’. This last item is for me, personally, very important and often underestimated. Science is mostly messy. Whoever tells us otherwise, is not talking about Science.”

The standard mode at top economics journals

See this overview of the process and timings at top journals in economics: Hadavand et al. report an average of over 24 months between initial submission and final acceptance (and nearly three years until publication).

Multiple dimensions of feedback

Journal-independent review allows work to be rated separately in different areas: theoretical rigor and innovation, empirical methods, policy relevance, and so on, with separate ratings in each category by experts in that area. As a researcher in the current system, I cannot both submit my paper to, and get public evaluation from, (for example) JET and the Journal of Development Economics for a paper engaging both areas.

The Unjournal, and journal-independent evaluation, can enable this through

  • commissioning a range of evaluators with expertise in distinct areas, and making this expertise known in the public evaluations;

  • asking specifically for multiple dimensions of quantitative (and descriptive) feedback and ratings (see especially under our Guidelines for evaluators); and

  • allowing authors to gain evaluation in particular areas in addition to the implicit value of publication in specific traditional field journals.


Why Unjournal?

What do we offer? How does it improve upon traditional academic review/publishing?

Reshaping academic evaluation: The Unjournal's process reduces the high costs and "gaming" associated with standard journal publication mechanisms.

Promoting open and robust science: We promote research replicability and robustness in line with the RRC agenda.

Global priorities: We prioritize impactful work. Expert evaluators focus on reliability, robustness, and usefulness. This fosters a productive bridge between high-profile mainstream researchers and global-impact-focused organizations, researchers, and practitioners.

Open, reliable, and useful evaluation: We open up the evaluation process, making it more timely and transparent, and providing valuable public metrics and feedback for the benefit of authors, other researchers, and policymakers who may want to use the research.

Promoting 'Dynamic Documents' and 'Living Research Projects': By separating evaluation from journal publication, we free research from static 'PDF prisons'. This enables "dynamic documents/notebooks" that boost transparency and replicability and improve research communication through web-based formats. It also facilitates "living projects": research that can continuously grow, improving in response to feedback and incorporating new data and methods in the same environment.


Open, reliable, and useful evaluation

Open evaluation (and rating)

Traditional peer review is a closed process, with reviewers' and editors' comments and recommendations hidden from the public.

In contrast, The Unjournal's evaluations (along with authors' responses and evaluation manager summaries) are made public and easily accessible. We give each of these a separate DOI and work to make sure each enters the literature and bibliometric databases. We aim further to curate these, making it easy to see the evaluators' comments in the context of the research project (e.g., with sidebar/hover annotation).

Open evaluation is more useful:

  • to other researchers and students (especially those early in their careers). Seeing the dialogue helps them digest the research itself and understand its relationship to the wider field. It helps them understand the strengths and weaknesses of the methods and approaches used, and how much agreement there is over these choices. It gives an inside perspective on how evaluation works.

  • to people using the research, providing further perspectives on its value, strengths and weaknesses, implications, and applications.

Publicly posting evaluations and responses may also lead to higher quality and more reliability. Evaluators can choose whether or not they wish to remain anonymous; there are pros and cons to each choice, but in either case, the fact that all the content is public may encourage evaluators to more fully and transparently express their reasoning and justifications. (And where they fail to do so, readers of the evaluation can take this into account.)

The fact that we are asking for evaluations and ratings of all the projects in our system—and not using "accept/reject"—should also drive more careful and comprehensive evaluation and feedback. At a traditional top-ranked journal, a reviewer may limit themselves to a few vague comments implying that the paper is "not interesting or strong enough to merit publication." This would not make sense within the context of The Unjournal.

More reliable, precise, and useful metrics

We do not "accept or reject" papers; we are evaluating research, not "publishing" it. But then, how do other researchers and students know whether the research is worth reading? How can policymakers know whether to trust it? How can it help a researcher advance their career? How can grantmakers and organizations know whether to fund more of this research?

As an alternative to the traditional measure of worth—asking, "what tier did a paper get published in?"—The Unjournal provides metrics: We ask evaluators to provide a specific set of ratings and predictions about aspects of the research, as well as aggregate measures. We make these public. We aim to synthesize and analyze these ratings in useful ways, as well as make this quantitative data accessible to meta-science researchers, meta-analysts, and tool builders.

Feel free to check out our ratings metrics and prediction metrics (these are our pilot metrics; we aim to refine them).

These metrics are separated into different categories designed to help researchers, readers, and users understand things like:

  • How much can one believe the results stated by the authors (and why)?

  • How relevant are these results for particular real-world choices and considerations?

  • Is the paper written in a way that is clear and readable?

  • How much does it advance our current knowledge?

We also request overall ratings and predictions of the credibility, importance, and usefulness of the work, which help benchmark these evaluations against each other and against the current "journal tier" system.

However, The Unjournal's metrics are precise in a sense that "journal publication tiers" are not. There is no agreed-upon metric of exactly how journals rank (e.g., within economics' "top-5" or "top field journals"). More importantly, there is no clear measure of the relative quality and trustworthiness of a paper within a particular journal.

In addition, there are issues of lobbying, career concerns, and timing, discussed elsewhere, which make the "tiers" system less reliable. An outsider doesn't know, for example:

  • Was a paper published in a top journal because of a special relationship and connections? Was an editor trying to push a particular agenda?

  • Was it published in a lower-ranked journal because the author needed to get some points quickly to fill their CV for an upcoming tenure decision?

In contrast, The Unjournal requires evaluators to give specific, precise, quantified ratings and predictions (along with an explicit metric of the evaluator's uncertainty over these appraisals).
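To illustrate what quantified ratings with explicit uncertainty make possible, here is a minimal Python sketch of one way such ratings might be aggregated, weighting each evaluator by the precision implied by their 90% credible interval. The numbers are invented, and this is just one simple aggregation choice, not The Unjournal's official method.

```python
# Minimal sketch (hypothetical numbers): precision-weighted aggregation of two
# evaluators' ratings, each reported with a 90% credible interval.
ratings = [
    {"midpoint": 78, "lo": 70, "hi": 86},  # evaluator 1: narrow interval, more certain
    {"midpoint": 62, "lo": 40, "hi": 84},  # evaluator 2: wide interval, less certain
]

def precision(r, z90=1.645):
    """Approximate precision from a 90% interval (narrower interval = larger weight)."""
    sd = (r["hi"] - r["lo"]) / (2 * z90)
    return 1 / sd ** 2

total = sum(precision(r) for r in ratings)
aggregate = sum(precision(r) * r["midpoint"] for r in ratings) / total
print(f"Precision-weighted aggregate rating: {aggregate:.1f}")
```

The broader point is that numeric ratings plus stated uncertainty can be combined, compared, and analyzed in ways that opaque "journal tiers" cannot.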

Of course, our systems will not solve all problems associated with reviews and evaluations: power dynamics, human weaknesses, and limited resources will remain. But we hope our approach moves in the right direction.

Better feedback

See also Mapping evaluation workflow.

Faster (public) evaluation

We want to reduce the time between when research is done (and a paper or other research format is released) and when other people (academics, policymakers, journalists, etc.) have a credible measure of "how much to believe the results" and "how useful this research is."

Here's how The Unjournal can do this.

  1. Early evaluation: We will evaluate potentially impactful research soon after it is released (as a working paper, preprint, etc.). We will encourage authors to submit their work for our evaluation, and we will directly commission the evaluation of work from the highest-prestige authors.

  2. Incentives for timeliness: We will pay evaluators, with further incentives for timeliness (as well as carefulness, thoroughness, communication, and insight). Evidence suggests that these incentives for promptness and other qualities are likely to work.

  3. Public evaluations and ratings: Rather than waiting years to see "what tier journal a paper lands in," the public can simply consult The Unjournal to find credible evaluations and ratings.

  4. See

Can The Unjournal "do feedback to authors better" than traditional journals?

Maybe we can?

  • We pay evaluators.

  • The evaluations are public, and some evaluators sign their evaluations.

    • → Evaluators may be more motivated to be careful and complete.

On the other hand . . .

  • For public evaluations, evaluators might default to being overly cautious.

  • At standard journals, referees do want to impress editors, and often (but not always) leave very detailed comments and suggestions.
