They're all a little dumb. I asked claude for a python function or functions tha...

frumiousirc · 2025-09-26T10:33:41 1758882821

I just asked Gemini Flash to do this. I included the instruction to use regular expressions to do the conversion to ANSI. It gave me a reasonable Python function which boils down to calling `re.sub()` for each of bold, italic and underline. For italics:

    text = re.sub(r'(\*|_)(.+?)\1', replace_italic, text, flags=re.DOTALL)

`replace_italic` is a one-line callback that wraps the regex match in the ANSI codes.
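For anyone following along, here is a minimal sketch of what that callback and call might look like. The ANSI constants and the wrapper name `italics_to_ansi` are my assumptions for illustration; only the `re.sub()` line is taken from the generated code.

```python
import re

# Assumed constants: SGR 3 turns italics on, SGR 0 resets all attributes.
ANSI_ITALIC = "\033[3m"
ANSI_RESET = "\033[0m"

def replace_italic(match: re.Match) -> str:
    # Group 1 is the delimiter (* or _); group 2 is the text between the pair.
    return f"{ANSI_ITALIC}{match.group(2)}{ANSI_RESET}"

def italics_to_ansi(text: str) -> str:
    # Hypothetical wrapper name; the re.sub() call is the line quoted above.
    return re.sub(r'(\*|_)(.+?)\1', replace_italic, text, flags=re.DOTALL)

print(repr(italics_to_ansi("a *b* and _c_")))
```

The `\1` backreference is what forces the closing delimiter to be the same character as the opening one, so `*b_` is left alone.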

Knowing what technique is "best" and telling the LLM to use it produces better results (on average) than giving the LLM freedom to choose. For some problems, writing a prompt specific enough to get good output becomes more work than just thinking it through and writing the code myself.

For very complex things, I cannot put the design into English in my own head but can "see" the correct answer as code concepts. I don't know if this is universal for all developers. If it is, it shows a limit of LLMs' usefulness.

magicalhippo · 2025-09-26T16:15:14 1758903314

I asked Gemini 2.5 Flash to just write me a function to do it, and it used regex.

In a separate chat, I asked it to suggest some ways to do it first, and it provided three alternatives, but suggested I start with regex.

FWIW I used Open WebUI, which uses the API (via OpenRouter). I've seen people here mentioning that the API produces much better results than the chat app, for whatever reason.

For reference, here's the prompt and the zero-shot result, minus the rigorous comments it added:

I need a Python function that will take in markdown in a string and return a string with ansi codes for bold, italics and underline. Can you write me such a function?

  import re
  
  def markdown_to_ansi(markdown_string: str) -> str:
      ansi_string = markdown_string

      ANSI_BOLD = "\033[1m"
      ANSI_ITALIC = "\033[3m"
      ANSI_UNDERLINE = "\033[4m"
      ANSI_RESET = "\033[0m"
  
      ansi_string = re.sub(r'\*\*(.*?)\*\*', f'{ANSI_BOLD}\\1{ANSI_RESET}', ansi_string)
      ansi_string = re.sub(r'__(.*?)__', f'{ANSI_BOLD}\\1{ANSI_RESET}', ansi_string)
  
      ansi_string = re.sub(r'\*(.*?)\*', f'{ANSI_ITALIC}\\1{ANSI_RESET}', ansi_string)
      ansi_string = re.sub(r'\_(.*?)\_', f'{ANSI_ITALIC}\\1{ANSI_RESET}', ansi_string)
  
      ansi_string = re.sub(r'\+\+(.*?)\+\+', f'{ANSI_UNDERLINE}\\1{ANSI_RESET}', ansi_string)
  
      # A more robust solution for nesting would require a parsing library or a more complex state machine.
      # However, for simple cases, applying in order can layer them correctly.
  
      return ansi_string
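On the nesting caveat in those last two comments: with a full reset (`\033[0m`) after every span, an italic span inside a bold span switches bold off as well when it closes. A rough sketch of one way around that for simple cases, assuming the terminal honours the per-attribute "off" codes (SGR 22 and 23). The function name is hypothetical and this is not what Gemini produced:

```python
import re

# SGR codes that switch one attribute on or off, instead of resetting everything with 0.
BOLD_ON, BOLD_OFF = "\033[1m", "\033[22m"
ITALIC_ON, ITALIC_OFF = "\033[3m", "\033[23m"

def markdown_to_ansi_nested(text: str) -> str:
    # Bold first, so '**' is consumed before the single '*' italic pattern runs.
    text = re.sub(r'\*\*(.+?)\*\*', lambda m: f"{BOLD_ON}{m.group(1)}{BOLD_OFF}", text)
    # Closing an italic span only turns italics off; any enclosing bold stays on.
    text = re.sub(r'\*(.+?)\*', lambda m: f"{ITALIC_ON}{m.group(1)}{ITALIC_OFF}", text)
    return text

print(repr(markdown_to_ansi_nested("**a *b* c**")))
```

This still isn't a parser; it only keeps one attribute from clobbering another when the spans happen to nest cleanly.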

frumiousirc · 2025-09-27T10:43:22 1758969802

> I asked it to suggest some ways to do it first

Yes, this is a very effective tactic, in my experience! Especially when I am asking for a solution where I am not confident I know what is "best". Having a "pre chat" to settle "what to do" and then "how to do it" before finally telling the LLM to "do it" is often worth the extra time for getting it to provide a solution for complex problems.

lelanthran · 2025-09-26T11:58:08 1758887888

> I included the instruction to use regular expressions to do the conversion to ANSI.

The vibe coders (who I referred to in my comment) aren't giving implementation tips.

What did it give you before you put an implementation tip into your prompt?

=======

FWIW, if you're at all interested, here's my implementation:

    def markdown_ansi_code_subst(mdstr: str, src_pattern: str, replacement_start: str, replacement_end: str) -> str:
        while src_pattern in mdstr:
            mdstr = mdstr.replace(src_pattern, replacement_start, 1)
            mdstr = mdstr.replace(src_pattern, replacement_end, 1)
        return mdstr

The caller supplies the pattern (`*` for italic, `**` for bold, etc) and a start/end replacement. As you can imagine, I store all of that in a static lookup table.

I feel this is more readable than regexes.
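A sketch of how the lookup table and the calling loop might fit together. The table contents, its ordering, and the name `apply_markup_table` are assumptions for illustration, not the actual table:

```python
ANSI_RESET = "\033[0m"

# Hypothetical table of (markdown marker, ANSI start, ANSI end).
# Longer markers come first so '**' is consumed before '*' can match inside it.
MARKUP_TABLE = [
    ("**", "\033[1m", ANSI_RESET),  # bold
    ("*", "\033[3m", ANSI_RESET),   # italic
]

def markdown_ansi_code_subst(mdstr: str, src_pattern: str,
                             replacement_start: str, replacement_end: str) -> str:
    # Same logic as the function above: alternate start/end for each marker found.
    while src_pattern in mdstr:
        mdstr = mdstr.replace(src_pattern, replacement_start, 1)
        mdstr = mdstr.replace(src_pattern, replacement_end, 1)
    return mdstr

def apply_markup_table(mdstr: str) -> str:
    # Run every table entry over the string, in table order.
    for marker, start, end in MARKUP_TABLE:
        mdstr = markdown_ansi_code_subst(mdstr, marker, start, end)
    return mdstr

print(repr(apply_markup_table("**b** and *i*")))
```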

frumiousirc · 2025-09-27T10:48:06 1758970086

The prompt was:

> Give me a Python function that takes a string holding text in Markdown markup syntax and that uses regular expressions to replace any Markdown markup codes for bold, italics and underline with their ANSI equivalent.

BTW, your solution will produce bad output. Markdown's "bold" etc. markup comes in pairs of markers, and your simple replacement will match singlets.
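To make the singlet problem concrete, here is a small sketch. The first function is copied from the comment above; `subst_pairs_only` is a hypothetical variant that only substitutes while at least two markers remain, and the ANSI constants are assumed:

```python
ANSI_ITALIC = "\033[3m"
ANSI_RESET = "\033[0m"

def markdown_ansi_code_subst(mdstr: str, src_pattern: str,
                             replacement_start: str, replacement_end: str) -> str:
    # Copied from the parent comment.
    while src_pattern in mdstr:
        mdstr = mdstr.replace(src_pattern, replacement_start, 1)
        mdstr = mdstr.replace(src_pattern, replacement_end, 1)
    return mdstr

def subst_pairs_only(mdstr: str, src_pattern: str,
                     replacement_start: str, replacement_end: str) -> str:
    # Hypothetical variant: leave a lone (unpaired) marker untouched.
    while mdstr.count(src_pattern) >= 2:
        mdstr = mdstr.replace(src_pattern, replacement_start, 1)
        mdstr = mdstr.replace(src_pattern, replacement_end, 1)
    return mdstr

# A lone '*' opens italics and never closes them:
print(repr(markdown_ansi_code_subst("2 * 3", "*", ANSI_ITALIC, ANSI_RESET)))
# The pairs-only variant leaves the text alone:
print(repr(subst_pairs_only("2 * 3", "*", ANSI_ITALIC, ANSI_RESET)))
```

Even the pairs-only variant will happily pair up two unrelated asterisks (as in `2 * 3 * 4`), so it narrows the problem rather than solving it.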