JM (Jason Meridth)

I am a continuously learning software developer trying to not let best be the enemy of better.

Using sed and regex to Remove Prefix on Filenames

24 Feb 2014 » command line

I used wp2md recently to generate markdown files out of my WordPress posts so that I could migrate to a static blog generator (like Nikola, the one I chose to use on this blog). However, this exported the files with a datetime stamp prefix (e.g.,

I have used sed many times from my command line toolbox to rename files or change content inside of files. Here is how I removed the datetime stamp prefix from my filenames.

for f in *.md; do echo "$f" | sed -r 's/.*(\d*)-(.*\.md)/\2/'; done
  • the for loop iterates over just my markdown files (files with the .md extension)
  • I echo the filename into the sed command line tool via a pipe. You can send the output of one command to the input of another by piping them together (e.g., command1 | command2)
  • Then I set up the substitution s/old_string/new_string/
    • I’ve learned that when using group matching (the sections in the parentheses) to replace the old_string with an entirely new string, you need to start the old_string with .* so that the pattern matches from the start of the line. The entire line is then replaced with the contents of the new string
    • old_string regex
      • (\d*) - first group match - meant to match any digit, zero or more times. Note that sed does not treat \d as a digit shorthand ([0-9] is the portable spelling), and the greedy leading .* already consumes everything up to the last hyphen, so this group ends up empty
      • "-" - a hyphen separating the match groups
      • (.*\.md) - second group match - anything, ending in .md
    • new_string regex
      • \2 - replace the entire old string with the contents of the second group match
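To see how the leading .* and the groups interact, the substitution can be tried on a few sample names before touching any files. This is only a sketch: the filenames are hypothetical (the real exported names may look different), the function names are made up for illustration, and the digit class is written as [0-9] because sed does not understand \d. It also uses -E, which both GNU and BSD sed accept for extended regexes.

```shell
#!/usr/bin/env bash
# Hypothetical sample names; the real wp2md output names may differ.

# Same shape as the one-liner, with [0-9] in place of \d.
# The greedy leading .* swallows everything up to the LAST hyphen.
strip_to_last_hyphen() {
    printf '%s\n' "$1" | sed -E 's/.*([0-9]*)-(.*\.md)/\2/'
}

# Anchored variant: removes only a leading run of digits plus one hyphen.
strip_digit_prefix() {
    printf '%s\n' "$1" | sed -E 's/^[0-9]+-(.*\.md)$/\1/'
}

strip_to_last_hyphen "20140224-post.md"      # prints: post.md
strip_to_last_hyphen "20140224-my-post.md"   # prints: post.md
strip_digit_prefix   "20140224-my-post.md"   # prints: my-post.md
```

If the part of the name you want to keep can itself contain hyphens, the anchored variant is probably the safer choice, since it only removes the leading digits.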

Example becomes

NB: the ; do and ; done are just bash scripting loop notation when put onto one line. This script could have also been written as

for f in *.md; do
    echo "$f" | sed -r 's/.*(\d*)-(.*\.md)/\2/'
done

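Note that the loop above only prints the new names. To actually rename the files, the sed output can be handed to mv. Here is a minimal sketch, assuming the prefix is a run of digits followed by a single hyphen; rename_md_files is a hypothetical helper name, not something from wp2md or Nikola.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename every *.md file in a directory, dropping a
# leading "<digits>-" prefix. Assumes that prefix shape; adjust the regex otherwise.
rename_md_files() {
    local dir=$1 f base new
    for f in "$dir"/*.md; do
        [ -e "$f" ] || continue              # no matches: the glob stays literal
        base=${f##*/}                        # filename without the directory part
        new=$(printf '%s\n' "$base" | sed -E 's/^[0-9]+-(.*\.md)$/\1/')
        [ "$new" = "$base" ] && continue     # no prefix found, nothing to do
        [ -e "$dir/$new" ] && continue       # never overwrite an existing file
        mv -- "$f" "$dir/$new"
    done
}

# Usage (run from the folder holding the exported posts):
#   rename_md_files .
```

Running the echo version first and eyeballing the output is still a good dry run before letting mv loose on real files.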
I am aware that Nikola has a WordPress importer, but it imports the files to *.wp and *.meta. I wanted markdown files (*.md).

NB: I also use or Pythex all the time to test out my regex expressions. You should check them out.