Instructions for Problem Setters

Core Judging Team

Max Ward (max.ward [at] uwa.edu.au)
Darcy Best
Timothy Buzzelli
Eliot Courtney

Overview

This page is intended to be a rough guideline for putting together a problem set. There will be exceptions to what is here, but for the most part, the procedures laid out on this page will be followed.

Please email Max if you have any questions or concerns.

Need Any Help?

Creating a problem can be a daunting task (especially the first time you do it). If you would like to submit a problem, but are having a hard time completing all of the tasks, please email Max and we will help out in any way we can!

Who Can Submit Problems?

Any person who is not a contestant on a team is eligible to submit a problem. This means that coaches of teams and previous contestants are welcome to submit problems. Of course, we are all on the honour system to not reveal to the contestants which questions will be used in the contests. We ask that you do not discuss the problems with others that are not on the Judging Team. Moreover, we ask that you do not initiate conversations with other judges about problems unless directed to do so—this allows us to have independent solutions for each problem.

Deadlines

These are the deadlines for the problems for the Preliminary and Regional Finals contests.

Deadlines

Problem Submission (June 29, 2025)
- The idea of your problem. Please see the Submit Problem section for the minimum requirements for this step.
Problem Requests (July 13, 2025)
- The core judging team will select problems that will be used for each contest.
- Each problem setter will be notified if their problem will be used or not by the deadline.
Developed Problem (August 3, 2025)
- The problem setters will fully develop their problem.
- The goal here is that the problem should be fully ready to be used.
- Remember that we are using Polygon to develop problems (see here).
- Minimum Requirements:
  - Problem Statement: A fully developed problem statement in any format (TeX, plaintext, Word). We will convert all files into LaTeX, so if you are comfortable using LaTeX, please submit your statement that way.
  - Model Solution: Create a C/C++ and/or Java solution for the problem.
  - Incorrect “Solution” [optional]: If you know any solution to the problem which is too slow, submit a program that implements that solution. If you can think of any incorrect algorithms that the contestants will try, submit that as well just to make sure that it really is wrong.
  - Sample and Secret Data: This should include approximately 30-70 test files that catch all of the corner cases and large cases. You should also include some completely random test cases (both for large n and small n).
  - Input Verifier: This file should check to make sure that your input is completely valid. This includes whitespace characters (e.g., “\r\n” vs “\n“), bounds checking, any special promises you made (e.g., connected graph), etc. See here for more details.
  - Output Verifier: If your problem accepts multiple outputs, please include an output verifier. See here for more details. If your problem only accepts one output (or the output is floating point values), you do not need to write an output verifier.
Moderators Receive Problems (TBA)
- The core judging team will distribute the problems to moderators.
- Each problem will have (approximately) 2-4 moderators, with at least one in C/C++ and one in Java.
Moderators Return Problems (TBA)
- You will receive the problem statement as if you were a contestant (not the secret data or any hints).
- Write a correct solution.
  - You may use C/C++, Java or Python, but we may ask for you not to use Python to ensure that we have enough solutions in C/C++ and Java.
- Ask any questions that come up while solving the problem: Is anything ambiguous? Could it be hard to understand if you are not a native English speaker?
- Provide at least 2 additional hard test cases to be added to the official data, preferably more.
- Provide an answer to the following:
  - Are there any algorithms that solve the problem but are in the wrong complexity class? (You may provide a time limit exceeded solution if you wish)
  - How hard do you think the problem is?
  - What is the complexity of your algorithm?

After the final deadline, the core judging team will fully put together the problem sets and will contact other members of the judging team if issues arise. After a problem is fully done, we will send it to some proof-readers who will provide their feedback on the problem.

Getting Involved

Judging Team Roles

The judging team is broken up into 4 different groups:

Core Judging Team: These members will decide which problems are used in the contests as well as guide the other groups in creating problems.
Problem Setters: These are the writers of the problems. They will provide a full problem to be solved which will be polished up by the other groups.
Moderators: These are the people who receive written problems and are asked to solve the problem. It is your responsibility to point out any/all errors you see. It is very important to ask as many clarifications as possible (the pickier you are in this step, the better). You will be asked to write a solution in C/C++, Java or Python for the problem.
Proof-Readers: After the moderators have written their solutions and the problems have been nearly finalised, we will give the problem to another person to read through the problem and check for any errors/potential clarifications. Once again, the pickier you are, the better. You will not need to write a solution for the problem.

You may be a member of multiple groups.

Acknowledgements

All members of the Judging Team will be acknowledged in the PDF of the problems on the front page. After all, it is a lot of work volunteering for this, so why shouldn’t you deserve recognition? Note that your name will not be directly associated with a problem on the problem statements. This is because sometimes knowing who the author is gives an insight as to how to solve it, which isn’t fair if only some of the teams know the author.

Submit a Problem or Join the Judging Team

Joining the Judging Team is easy! Send Max an email and you will be added to our list!

To submit a problem, please submit your ideas in an email to Max. With your idea submission, please use two separate files so that the core judging team can get an unbiased look at the problem before seeing the solution.

File 1: Problem description (problem.{txt,doc,pdf})
- Please do not include anything about the solution to the problem in this file.
- Problem Name: Note that we sometimes attempt to make each problem start with their problem letter (e.g., A: Alphabet Soup, B: Binary Heap, etc.), so feel free to submit multiple names so we have more options. =)
- Problem Description: At minimum, the problem statement does not need to have a story behind it, it can simply be the basic idea behind the question. However, the more polished your idea is, the more likely it is that we will use it.
File 2: Problem solution (solution.{txt,doc,pdf})
- Problem Type: Keywords about solution. E.g., “Graph”, “Trivial”, “DP”, “Math”, “Tedious”, etc.
- Intended Solution: A short description of the problem’s solution (including complexity).
- Incorrect Solutions: Any “solution” that you would expect to be time limit exceeded or wrong answer (e.g., greedy solution).
- Difficulty: If you are familiar with CodeForces or TopCoder, please indicate which level of problem this would be equivalent to. Otherwise, words like “Easy”, “Medium”, “Medium-Hard”, etc. are good.
- Easy Changes: Are there any ways to slightly alter the problem so that it is still interesting, but it changes the difficulty of the problem? For example, if you change n<100 to n<1000, is it still a doable problem? or if you change n<1000 to n<100, is the problem still interesting (and so more teams will get it). This will help us decide which problems to include in problem sets.

You may submit as much supporting material as you need with this (e.g., solutions you have, test data that you have generated, images, etc.).

FAQs and Common Errors

Which contests will my problems be used in?

When you submit a problem, it is added to the South Pacific Regional pool of problems. Each year, we host two official ICPC contests and many unofficial contests. Your problems will likely be used for the official contests (Preliminary or Regional Finals). However, if we have a lot of problems (or too many problems of one type), we may ask you if you would like your problem to be used in one of the unofficial contests (e.g., SPAR or New Zealand Programming Contest).

Difficulty of Problems

A common misconception is that you cannot make problems because your ideas are not hard enough. On the contrary, we need a lot of easy problems every year. Each year, we need 20-40 questions. We are looking for about 40% easy or easy-medium, 40% medium or medium-hard and 20% hard.

Clarity and Length of Problem Statement

There is no universal standard for the vocabulary used in the problem statement. However, as a general guideline, your statement must be easily understandable by someone whose first language is not English.

We encourage you to include a picture with each problem. This can dramatically improve the quality of a problem.

As for the length of the problem statement: once again, there is no universal standard, but you should be aiming for approximately one page. This is intended as both a rough minimum and maximum—somewhere between 0.75 and 1.25 pages including the story, input/output specs and sample input/output is perfect (or up to 1.5 if you are including a picture). If your statement is only one small paragraph, it is probably too short (with exception to the easy problems) and if your statement is many pages, it is probably too long (with exception to the very finicky problems). There is a fine line when writing problem statements between writing “What is the shortest path between x and y?” and writing a long story that basically asks the same thing. We ask that you find a middle ground between these two. Hiding the problem you are attempting to ask is an art! After a contestant reads the problem statement, they should know exactly what is being asked of them (we want them to focus on solving the problem, not spending a long period of time trying to figure out what is being asked).

Technical Details

Polygon

We will be using Polygon to develop problems for this year. Polygon is a problem development website. Darcy has written a wonderful introduction to using Polygon, and it can be found here. For those of you who are familiar with problemtools (this is what we used prior to 2017), this is a website that does strictly more than problemtools does, and is laid out in a nice website and should cause fewer headaches. We convert from polygon format to problemtools format during upload to DOMjudge, but this is generally handled by the core judging team and most problem setters will never need to be involved in that process.

Language Parity

As mentioned a couple times below (for long double and BigInteger), there are some pretty large differences between the different languages used in the programming contests. Our contests allow the following languages:

C/C++
Java
Python3 (using PyPy3)

ICPC rules state that every problem must be solvable in at least 2 languages. We generally target C++ and Java as our 2, but have a strong preference that all 3 are possible where reasonable. The problem that you submit must be doable in both C/C++ and Java in roughly the same way (so no long double, BigInteger, etc.). We generally want to support Python as much as possible. Ideally, your problem should be solvable in Python in a way that does not use Python specific features (such as big integers).

The reason we chose C++ and Java as our set of 2 is due to speed. It is a requirement that we separate all Time Limit Exceeded solutions from the Accepted solutions. This means the slowest Accepted solution should be much faster (usually 6x or more) than the fastest Time Limit Exceeded solution. This is often hard when support Python, since sometimes a fast TLE C++ solution is comparable to a slow Accepted Python solution despite having different time complexities. Since we introduced PyPy, Python is much faster and it is usually possible to have Python run as fast as Java. However, there are still some edge cases where it isn’t possible. In these cases, we chose not to support Python so that we can pose interesting algorithmic problems with specific target complexities in our contests.

Using Floating-Point

Floating point numbers are a tricky thing to deal with. Rounding errors happen. And they can happen when you don’t expect them to. For example, consider the following piece of code:

long long x = pow(5,2); // 5 squared
cout << x << endl;

You would expect this to produce 25, right? Well, under certain conditions, this actually prints 24! Shocking, I know. The problem is that pow works in floating-point values and returns back a floating point number which is then truncated to a long long. So pow(5,2) may return 24.9999999999 instead of 25 and then it would get truncated to 24. Some numbers cannot be represented perfectly by floating point numbers which means the computer will store it within some epsilon. What makes this example worse is that 2, 5 and 25 can all be represented perfectly in floating point numbers, so you wouldn’t expect an error ever! And in fact, pow(5,2) does return exactly 25 (from IEEE standards), but the conversion to long long can mess up depending on the assembly code it decides to use.

The policy for using floating point numbers in our contests is simple: if floating point values are required for a problem, then you (the problem setter) must be able to prove that rounding errors will not happen or if they do happen, are much smaller than anything that will matter. The best way around this is to only use problems that require integers. Your solution must not rely on long double in order to get accepted. This is to maintain parity between different languages as Java does not have a nice built-in equivalent (other than BigDecimal, which is quite slow).

If you require the contestants to output a floating point number, then we will use a special output checker that verifies if their output is correct. Your output specifications should read something similar to: “Any answer whose absolute or relative error is less than 10^-6 will be considered correct.” This will check the following: Let’s assume that the contestant’s answer equals x, and the correct answer is y. The checker program will consider your answer correct if |x-y| / max(1,y) <= 10^-6. You do not need to write a special checker for this as there is one built into DOMjudge.

Some useful words on the topic stolen from Google Code Jam (before they shut down their website)

Floating-Point Numbers

Round-off errors will happen. If you floor, they will happen. If you round, they will happen. Even if you compute your answer to 12 decimal places and round to the nearest integer, they will still happen. Always accept some margin of error, unless you are doing something particularly unique and creative.

If your problem requires rounding to make discrete decisions, you are almost certainly doing something very wrong! If your problem cannot be solved exactly using only integers, you should never require contestants to make decisions by comparing floating-point numbers. Don’t ask them to print “Yes” or “No” depending on whether a triangle is inside a circle, unless all of the triangle vertex and circle center coordinates, as well as the circle’s radius, are integers.

There are exceptions here in the case where you’re willing to make a guarantee about the structure of the input, but these are extremely difficult to get right. We did this in Ninjutsu: (“There will be at least one value of r that gives an optimal solution and has the property that a rope of length r – 0.999999 also gives the same solution”). We tried to prepare it for the 2009 finals, but couldn’t settle on a way to avoid floating point issues after a 96-email-long thread. We ended up preparing a 1-dimensional integer-only version as a backup. 80 emails later we ended up running it in 2010. We wrote 8 correct solutions (including two using the gmp library for high-precision decimals), 2 deliberate wrong answers (including one that was wrong because it used doubles instead of int64s to check collinearity) and a 1640-line-long generator that made sure ropes of length r – 0.9999991 didn’t work. The short version: don’t do this unless you really want to run the problem and there’s no other way to ask it.

As an alternative to spending a hundred person hours preparing your problem, you could recast it. Instead of asking “What should I do to get the highest probability of success?” ask “What’s the highest probability of success if I act optimally?” That way they aren’t making a binary decision (is A better than B?) but it’s clear the contestant has solved the problem, and you don’t create unnecessary and difficult work for yourself.

There’s research to support this: look up condition numbers to learn more. A decision problem causes an infinite condition number; artificially eliminating ill-conditioned regions (like in Ninjutsu) makes the condition number finite but often high; and asking for the maximum of two numbers is well-conditioned. Roughly speaking, the higher the condition number is, the worse errors will be.

Using BigInteger

In almost all cases, we do not allow problems that require BigInteger to solve them. The reason for this is because it is unfair for those using C++. Moreover, it rarely adds anything to the problem—does your problem become any different if you ask for the output modulo 1,000,000,009 instead? However, under rare circumstances, we may use a BigInteger problem.

Some useful words on the topic stolen from Google Code Jam (before they shut down their website)

Big Integers

Most modern programming languages have a built-in library to deal with integers greater than 2⁶⁴. Unfortunately C++ doesn’t have one built into the language, and it’s by far the most common language used by contestants.

Because of how common C++ is, and the fact that most programming contests steer away from BigInteger questions, we recommend against using problems that require, or are significantly helped by, a BigInteger library. Contestants won’t expect it, and it will create a significant bias in favour of those who are comfortable with a language other than C++.

Besides, are you sure you need large numbers? Is your problem going to be significantly less interesting if you ask for the answer modulo 1000000007? If not, you keep everybody happy by going with the modulo version and letting your contestants focus on the interesting, algorithmic part of the the solution instead of worrying about finding and using an appopriate BigInteger library. If, on the other hand, big integer arithmetic is the heart of the problem, then perhaps you should avoid it altogether — “Implement square root for BigIntegers” is not an original idea, and contestants with minimal non-C++ experience will have a huge advantage over those without any.

With that said, in Code Jam we’ve made the decision that we will ask problems that require a BigInteger library if the problem requires it, and the BigInteger aspect isn’t central. We’ve given fair warning; contestants have Internet access during the contest, and are welcome to use other people’s libraries; the rounds are long enough that a few minutes spent digging up a BigInt library aren’t going to kill someone’s chances; and, all things considered, in our particular contest we don’t mind giving a tiny edge to people who know more than one language.

Whitespace

Whitespace in Input

The whitespace in a problem’s input must be exactly as specified in the problem. For example, if you say “The first line contains five space separated integers.”, you must ensure that there is exactly one space between each integer and no leading or trailing whitespace on the line. We run all of our contests on Linux, and so all newline characters should be “\n” and not “\r\n”.

Whitespace in Output

Traditionally in programming contests, your output had to match the desired output exactly. This trend has gone by the way-side. Nowadays, whitespace in the output is largely ignored. Your problem statement should specify exactly how the output should look, but their answer is allowed to differ by small whitespace changes. Where this gets difficult is when you are writing your own checker to allow for multiple submissions. You must be very careful to not consider an answer wrong due to small changes in spacing. Be very careful with problems that require exact whitespacing. For example, consider this:

It is great to get them to draw this picture:

                       !###
                       ! ###
                       !  ###
                       !   ###
  *   _________________!____###________O_______
/            O          \    ##!     ==||       \
\O         /||--         \    #!       //        \
||\_        \\            \    !      _\\   O     \
\\           LL            \   !          ==||     \
 LL                         \  !            //      \
   \                         \ !           _\\       \
    \_________________________\!______________________\

but the picture below would also be accepted for that problem since it contains the same non-whitespace characters:

!###
! ###
! ###
! ###
* _________________!____###________O_______
/ O \ ##! ==|| \
\O /||-- \ #! // \
||\_ \\ \ ! _\\   O \
\\ LL \ ! ==|| \
LL \ ! // \
\ \ ! _\\ \
\_________________________\!______________________\

There are ways around this: you can use a special judger that does not ignore whitespace (but you must tell the contestants this in the problem statement) or, preferably, just replace the space character with another character (maybe the ‘.’). This also gets rid of any potential clarifications of if there are trailing spaces to make the image a rectangle or to just stop at the last non-whitespace character.

.......................!###............................
.......................!.###...........................
.......................!..###..........................
.......................!...###.........................
..*..._________________!____###________O_______........
/............O..........\....##!.....==||.......\......
\O........./||--.........\....#!.......//........\.....
||\_........\\............\....!......_\\...O.....\....
\\...........LL............\...!..........==||.....\...
.LL.........................\..!............//......\..
...\.........................\.!..........._\\.......\.
....\_________________________\!______________________\

Accepting Multiple Correct Outputs

Sometimes there is more than one correct output to a problem. For example, consider this problem:

Given an undirected graph, give me a shortest path from u to v.

Of course, there can be more than one path from u to v. There are a couple of ways to deal with this:

You can ask for a specific answer to the question (e.g., lexicographically least), but sometimes, this makes the problem a lot harder.
You can write a special checker for the problem that will verify the contestant’s output.

The checker that you use is written by you in whichever language you prefer. Make sure you specify in your problem statement that any correct solution will be accepted. Note that you do not need to write a special judger to check for floating point precision as this is built in to DOMjudge.

Some useful words on the topic stolen from Google Code Jam (before they shut down their website)

Some contests can have problems with multiple correct answers. In that case the problem preparer will have to write a judge that can precisely differentiate between correct answers and incorrect answers. In Code Jam we have such a mechanism, but if it’s possible to recast a problem in a way that gives it a unique solution we’ll often do it. Since problems with unique solutions are much more common in algorithmic contests, you should probably mention in the problem statement that there may be multiple correct answers, and you’ll accept any one.

Conversely, you should make sure that every correct solution gets accepted. If the problem is to find a shortest path through a maze, then every shortest path should be accepted as correct, not just one of the shortest paths. Take extra care to make sure that you have thought of all possible correct solutions. Sometimes, it makes sense to ask for the lexicographically smallest correct solution, if you want to ensure uniqueness. If you go that route, watch out: the problem may become orders of magnitude more difficult. For example, finding the lexicographically smallest shortest path through a maze is tricky, and it requires a precise definition what it means for one path to be “lexicographically smaller” than another path. Make sure to include this definition in your problem statement.

Setting Time Limits

All problems must have a solution that is runable in a short period of time. We aim for problems to have a time limit of no more than 2 seconds per test case.

The time limit for each problem is determined by taking the slowest non-Python judging solution and multiplying its running time by a scaling factor (likely 3 or 5). All judging solutions must not contain heavy optimizations (such as fast I/O) unless this is the point of the problem. In general, a program that is in the correct complexity class should not be time limit exceeded (e.g., if your intended solution is O(n²), then any algorithm that runs in O(n²) should not be time limit exceeded).

We also require that all algorithms that are intended to be time limit exceeded be at least twice the time limit when programmed using reasonable optimizations (such as fast I/O, breaking from a loop early, etc.). It may be possible for a contestant to heavily optimise their code and get the wrong complexity class accepted. This is undesirable, but this will not be a major focus of ours to prevent. We do require that reasonable optimizations of a slow complexity class not get accepted.

Note: You (as the problem setter) are not responsible for setting the time limit, we will only do this after we have received all of the other judges’ programs.

It is very difficult to enforce tight time limits.

Some useful words on the topic stolen from Google Code Jam (before they shut down their website)

How do you distinguish between O(n) and O(n log n)? You don’t. This is never a good idea, and you will never get it to work. Don’t make problems that require bucket sort or radix sort. Don’t expect people to use hash tables instead of binary search trees. Don’t insist on Fibonacci heaps for Dijkstra’s. Don’t attempt to force anyone to use union-by-rank in Union-Find. Don’t design a problem where your intended solution uses some particular heuristic (see “guess what the contest organizer is thinking” elsewhere), or some particular constant-factor speedup. Any number of factors can overwhelm the particular thing you’re looking for, including how well-optimized the implementation is and the user’s choice of programming language.

Input and Output

Number of Test Cases

For our region, we will use one test case per file. This is to maintain consistency with the World Finals who have switched to this method. Under certain circumstances, we will allow multiple test cases per file. In terms of the number of test cases, there is no firm minimum or maximum. However, aim for approximately 30-70. This should include hand-generated corner cases as well as purely random test cases.

Size of Input and Output

There is no official maximum size for the input and output files. However, if you have too much input or output, it can take a long period of time just for the contestants’ programs to read in the input (remember that we do not expect the contestants to use fast I/O in the contests).

A good rule of thumb is that inputting 200,000 integers is fine. Avoid outputting a large amount of data as this fills up the server quite quickly as it stores the output of every contestant’s program (this is not an issue for the input since it is only stored once).

If your program requires a lot of input, you may be able to provide “seeds” as input and then the contestants will generate the array themselves. Here is an example

900 point Problem (BagAndCards) from TopCoder SRM 679 (Problem statement here).

Verification of Input and Output

Input verifiers are a very important part of making problems (and are required with every problem submission). We ask that you create a separate program whose sole purpose is to verify that the input is correct. This should include (at minimum) bound checking and checks for whitespace (e.g., no leading or trailing whitespace). Your input verifier must be strict. A guide can be found in the Polygon tutorial.

You may optionally make an output verification program as well, but this is not required.

Using Fast I/O

In C++ and Java, there is a fast and a slow standard way of reading the input (scanf/cin and Scanner/BufferedReader). The difference in time between these two is astronomical in certain situations. These contests are mostly a test of the contestant’s problem solving skills, so ideally, we want to set up the problems in such a way that someone who implements the correct algorithm will get Accepted and someone who does the wrong algorithm gets wrong answer or time limit exceeded. When we are creating the problems, we ideally want this situation (assume that Algorithm A is the correct complexity and Algorithm B is the wrong complexity):

Algorithm A using scanf/BufferedReader takes x seconds
Algorithm A using cin/Scanner takes y seconds
Algorithm B using scanf/BufferedReader takes z seconds
Algorithm B using cin/Scanner takes w seconds

We want max(x,y)*p < min(z,w) where p is some scaling factor (likely 10).

In the rare situation where fast I/O is required for a problem, we can put a warning in the problem statement for the contestants that tells them that they may not be able to get accepted with the slower routines.

Sample Solutions

Correct Sample Solutions

When creating a problem (or being a problem moderator), you will need to create a solution for the problem in either C/C++, Java or Python. When creating your solutions, please do not use any large optimizations which affect the runtime, but not the complexity class. In particular, please do not use fast I/O or hash maps instead of binary trees. Do not intentionally make your program slower, just avoid putting in optimizations like these.

Incorrect or Too Slow “Solutions”

This is one section that some problem setters ignore, but it is extremely important! Do you want your code to run in O(n²) and want the O(n³) algorithm to be time limit exceeded? Make sure that that is the case! Please write a program for any solution that you want to be either wrong or time limit exceeded.

Examples of when you would want to write a wrong solution: Ensure that greedy doesn’t work, ensure if they write a solution using 32-bit integers when they should have used 64-bits that is it actually wrong.
Examples of when you would want to write a time limit exceeded solution: The intended solution is O(n log n), but there is a O(n²) solution. You should write the O(n²) solution to make sure that it is much slower than the O(n log n) solution.
If you are writing a solution that should be time limit exceeded, please use fast I/O and simple optimizations like using hash maps instead of binary trees so that we know it is REALLY time limit exceeded!