Mercurial does not work with files in your repository unless you tell it to manage them. The “hg status” command will tell you which files Mercurial doesn’t know about; it uses a “?” to display such files.
To tell Mercurial to track a file, use the “hg add” command. Once you have added a file, the entry in the output of “hg status” for that file changes from “?” to “A”.
After you run a “hg commit”, the files that you added before the commit will no longer be listed in the output of “hg status”. The reason is that, by default, “hg status” only tells you about “interesting” files: those that you have modified or told Mercurial to do something with. If you have a repository that contains thousands of files, you will rarely want to know about files that Mercurial is tracking but that have not changed. (You can still get this information; we’ll return to this later.)
Once you add a file, Mercurial doesn’t do anything with it immediately. Instead, it will take a snapshot of the file’s state the next time you perform a commit. It will then continue to track the changes you make to the file every time you commit, until you remove the file.
Mercurial has a useful behaviour: if you pass the name of a directory to a command, the command will treat this as “I want to operate on every file in this directory and its subdirectories”.
When you add a directory in this way, Mercurial prints the names of the files it adds, whereas it prints nothing when you add a single file, such as one named a, by naming it explicitly.
The difference is that when you explicitly name the file to add on the command line, Mercurial assumes that you know what you are doing, and doesn’t print any output.
However, when we imply the names of files by giving the name of a directory, Mercurial takes the extra step of printing the name of each file that it does something with. This makes it clearer what is happening, and reduces the likelihood of a silent and nasty surprise. This behaviour is common to most Mercurial commands.
Mercurial does not track directory information. Instead, it tracks the path to a file. Before creating a file, it first creates any missing directory components of the path. After it deletes a file, it then deletes any empty directories that were in the deleted file’s path. This sounds like a trivial distinction, but it has one minor practical consequence: it is not possible to represent a completely empty directory in Mercurial.
Empty directories are rarely useful, and there are unintrusive workarounds that you can use to achieve an appropriate effect. The developers of Mercurial thus felt that the complexity that would be required to manage empty directories was not worth the limited benefit this feature would bring.
If you need an empty directory in your repository, there are a few ways to achieve this. One is to create a directory, then “hg add” a “hidden” file to that directory. On Unix-like systems, any file name that begins with a period (“.”) is treated as hidden by most commands and GUI tools. This approach is illustrated in figure 5.1.
Another way to tackle a need for an empty directory is to simply create one in your automated build scripts before they need it.
Once you decide that a file no longer belongs in your repository, use the “hg remove” command; this deletes the file, and tells Mercurial to stop tracking it. A removed file is represented in the output of “hg status” with an “R”.
After you “hg remove” a file, Mercurial will no longer track changes to that file, even if you recreate a file with the same name in your working directory. If you do recreate a file with the same name and want Mercurial to track the new file, simply “hg add” it. Mercurial will know that the newly added file is not related to the old file of the same name.
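It is important to understand that removing a file has only two effects: it deletes the current version of the file from the working directory, and it stops Mercurial from tracking changes to the file from the next commit onward.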
Removing a file does not in any way alter the history of the file.
If you update the working directory to a changeset in which a file that you have removed was still tracked, it will reappear in the working directory, with the contents it had when you committed that changeset. If you then update the working directory to a later changeset, in which the file had been removed, Mercurial will once again remove the file from the working directory.
Mercurial considers a file that you have deleted without using “hg remove” to be missing. A missing file is represented with “!” in the output of “hg status”. Mercurial commands will not generally do anything with missing files.
If your repository contains a file that “hg status” reports as missing, and you want the file to stay gone, you can run “hg remove --after” at any time later on, to tell Mercurial that you really did mean to remove the file.
On the other hand, if you deleted the missing file by accident, use “hg revert filename” to recover the file. It will reappear, in unmodified form.
You might wonder why Mercurial requires you to explicitly tell it that you are deleting a file. Early during the development of Mercurial, it let you delete a file however you pleased; Mercurial would notice the absence of the file automatically when you next ran a “hg commit”, and stop tracking the file. In practice, this made it too easy to accidentally remove a file without noticing.
Mercurial offers a combination command, “hg addremove”, that adds untracked files and marks missing files as removed.
The “hg commit” command also provides a -A option that performs this same add-and-remove, immediately followed by a commit.
Mercurial provides a “hg copy” command that lets you make a new copy of a file. When you copy a file using this command, Mercurial makes a record of the fact that the new file is a copy of the original file. It treats these copied files specially when you merge your work with someone else’s.
What happens during a merge is that changes “follow” a copy. To best illustrate what this means, let’s create an example. We’ll start with the usual tiny repository that contains a single file.
We need to do some work in parallel, so that we’ll have something to merge. So let’s clone our repository.
Back in our initial repository, let’s use the “hg copy” command to make a copy of the first file we created.
If we look at the output of the “hg status” command afterwards, the copied file looks just like a normal added file.
But if we pass the -C option to “hg status”, it prints another line of output: this is the file that our newly-added file was copied from.
Now, back in the repository we cloned, let’s make a change in parallel. We’ll add a line of content to the original file that we created.
Now we have a modified file in this repository. When we pull the changes from the first repository, and merge the two heads, Mercurial will propagate the changes that we made locally to file into its copy, new-file.
This behaviour, of changes to a file propagating out to copies of the file, might seem esoteric, but in most cases it’s highly desirable.
First of all, remember that this propagation only happens when you merge. So if you “hg copy” a file, and subsequently modify the original file during the normal course of your work, nothing will happen.
The second thing to know is that modifications will only propagate across a copy as long as the repository that you’re pulling changes from doesn’t know about the copy.
The reason that Mercurial does this is as follows. Let’s say I make an important bug fix in a source file, and commit my changes. Meanwhile, you’ve decided to “hg copy” the file in your repository, without knowing about the bug or having seen the fix, and you have started hacking on your copy of the file.
If you pulled and merged my changes, and Mercurial didn’t propagate changes across copies, your copy of the source file would still contain the bug, and unless you remembered to apply the bug fix by hand, the bug would remain there.
By automatically propagating the change that fixed the bug from the original file to the copy, Mercurial prevents this class of problem. To my knowledge, Mercurial is the only revision control system that propagates changes across copies like this.
Once your change history has a record that the copy and subsequent merge occurred, there’s usually no further need to propagate changes from the original file to the copied file, and that’s why Mercurial only propagates changes across copies until this point, and no further.
If, for some reason, you decide that this business of automatically propagating changes across copies is not for you, simply use your system’s normal file copy command (on Unix-like systems, that’s cp) to make a copy of a file, then “hg add” the new copy by hand. Before you do so, though, please do reread section 5.3.2, and make an informed decision that this behaviour is not appropriate to your specific case.
When you use the “hg copy” command, Mercurial makes a copy of each source file as it currently stands in the working directory. This means that if you make some modifications to a file, then “hg copy” it without first having committed those changes, the new copy will also contain the modifications you have made up until that point. (I find this behaviour a little counterintuitive, which is why I mention it here.)
The “hg copy” command acts similarly to the Unix cp command (you can use the “hg cp” alias if you prefer). The last argument is the destination, and all prior arguments are sources. If you pass it a single file as the source, and the destination does not exist, it creates a new file with that name.
If the destination is a directory, Mercurial copies its sources into that directory.
Copying a directory is recursive, and preserves the directory structure of the source.
If the source and destination are both directories, the source tree is recreated in the destination directory.
As with the “hg rename” command, if you copy a file manually and then want Mercurial to know that you’ve copied the file, simply use the --after option to “hg copy”.
It’s rather more common to need to rename a file than to make a copy of it. The reason I discussed the “hg copy” command before talking about renaming files is that Mercurial treats a rename in essentially the same way as a copy. Therefore, knowing what Mercurial does when you copy a file tells you what to expect when you rename a file.
When you use the “hg rename” command, Mercurial makes a copy of each source file, then deletes the original and marks it as removed.
The “hg status” command shows the newly copied file as added, and the copied-from file as removed.
As with the results of a “hg copy”, we must use the -C option to “hg status” to see that the added file is really being tracked by Mercurial as a copy of the original, now removed, file.
As with “hg remove” and “hg copy”, you can tell Mercurial about a rename after the fact using the --after option. In most other respects, the behaviour of the “hg rename” command, and the options it accepts, are similar to the “hg copy” command.
Since Mercurial’s rename is implemented as copy-and-remove, the same propagation of changes happens when you merge after a rename as after a copy.
If I modify a file, and you rename it to a new name, and then we merge our respective changes, my modifications to the file under its original name will be propagated into the file under its new name. (This is something you might expect to “simply work,” but not all revision control systems actually do this.)
Whereas having changes follow a copy is a feature where you can perhaps nod and say “yes, that might be useful,” it should be clear that having them follow a rename is definitely important. Without this facility, it would simply be too easy for changes to become orphaned when files are renamed.
The case of diverging names occurs when two developers start with a file—let’s call it foo—in their respective repositories.
Anne renames the file to bar.
Meanwhile, Bob renames it to quux.
I like to think of this as a conflict because each developer has expressed different intentions about what the file ought to be named.
What do you think should happen when they merge their work? Mercurial’s actual behaviour is that it always preserves both names when it merges changesets that contain divergent renames.
Notice that Mercurial does warn about the divergent renames, but it leaves it up to you to do something about the divergence after the merge.
Another kind of rename conflict occurs when two people choose to rename different source files to the same destination. In this case, Mercurial runs its normal merge machinery, and lets you guide it to a suitable resolution.
Mercurial has a longstanding bug in which it fails to handle a merge where one side has a file with a given name, while another has a directory with the same name. This is documented as Mercurial bug no. 29.
Mercurial has some useful commands that will help you to recover from some common mistakes.
The “hg revert” command lets you undo changes that you have made to your working directory. For example, if you “hg add” a file by accident, just run “hg revert” with the name of the file you added, and while the file won’t be touched in any way, it won’t be tracked for adding by Mercurial any longer, either. You can also use “hg revert” to get rid of erroneous changes to a file.
Remember that the “hg revert” command only helps with changes that you have not yet committed. Once you’ve committed a change, if you decide it was a mistake, you can still do something about it, though your options may be more limited.
For more information about the “hg revert” command, and details about how to deal with changes you have already committed, see chapter 9.