How to convert from Windows1251 String to UTF String? #18

@rpungin

This is more of a question than an issue, but hopefully someone can answer it.

I am loading the HTML of a Russian webpage into a String variable using the http package, like so:

String html = (await http.Client().get(Uri.parse(url))).body;

The website is encoded as Windows-1251, so the html variable ends up containing text such as "Êàêèå". This is what I see when I print the variable.

So my question is: how do I convert that string to Unicode Cyrillic characters, which should give "Какие"?

I tried this:

import 'dart:convert';
import 'package:enough_convert/enough_convert.dart';

void main() {
  final html = "Êàêèå";
  final encoded = const Windows1251Codec().encode(html);
  final converted = const Utf8Codec().decode(encoded);
  print(converted);
}

But the line final encoded = const Windows1251Codec().encode(html); throws this error:

FormatException: Invalid value in input: "Ê" / (202) at index 0 of "Êàêèå"

Essentially, what I would like to do is convert "Êàêèå" to "Какие". The website https://convertcyrillic.com can do this conversion.

So how do I do this programmatically in Dart?
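
One possible direction, offered as a sketch rather than a confirmed answer: "Êàêèå" looks like Windows-1251 bytes that were decoded as Latin-1, so the string can be turned back into bytes with latin1 and those bytes then decoded as Windows-1251. The attempt above goes the other way (it encodes to Windows-1251), which would explain the FormatException, since "Ê" has no Windows-1251 representation. To stay runnable with dart:convert alone, the sketch uses a small hand-written decoder that covers only ASCII and the Cyrillic letters; decodeWindows1251 and repairMojibake are hypothetical helper names, not part of any package.

```dart
import 'dart:convert';

/// Hypothetical helper: decodes Windows-1251 bytes to a Dart string.
/// Covers ASCII, the letters А..я (0xC0..0xFF) and Ё/ё only; every other
/// byte becomes U+FFFD. A full codec would handle the whole table.
String decodeWindows1251(List<int> bytes) {
  final buffer = StringBuffer();
  for (final b in bytes) {
    if (b < 0x80) {
      buffer.writeCharCode(b); // ASCII is unchanged
    } else if (b >= 0xC0 && b <= 0xFF) {
      buffer.writeCharCode(0x0410 + (b - 0xC0)); // А..я
    } else if (b == 0xA8) {
      buffer.writeCharCode(0x0401); // Ё
    } else if (b == 0xB8) {
      buffer.writeCharCode(0x0451); // ё
    } else {
      buffer.writeCharCode(0xFFFD); // not handled in this sketch
    }
  }
  return buffer.toString();
}

/// Hypothetical helper: assumes [garbled] is Windows-1251 text that was
/// mistakenly decoded as Latin-1, and reverses that mistake.
String repairMojibake(String garbled) {
  final bytes = latin1.encode(garbled); // recover the original bytes
  return decodeWindows1251(bytes); // reinterpret them as Windows-1251
}

void main() {
  print(repairMojibake('Êàêèå')); // Какие
}
```

With enough_convert, the hand-written decoder could presumably be replaced by const Windows1251Codec().decode(bytes). If the raw bytes are still available, for example bodyBytes on the http package's Response, decoding those directly avoids the Latin-1 round trip altogether.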
