dataman's favorite posts

Science fiction writer Dan Simmons has died

Forum: Talks

American science fiction writer Dan Simmons has died in Longmont, Colorado, at the age of 77.

Over his lifetime Simmons wrote 31 novels and short-story collections. His books were published in 28 countries and translated into 20 languages.

The Linux connection here is that @dataman liked him, and I found it interesting to read about.

science fiction

amd_amd
(28.02.26 10:51:38 MSK)

46 comments

Neural networks in C from the creator of Redis

Forum: Development

Salvatore Sanfilippo has gotten into neural networks too.

https://github.com/antirez/iris.c:

Iris is an inference pipeline that generates images from text prompts using open weights diffusion transformer models. It is implemented entirely in C, with zero external dependencies beyond the C standard library. MPS and BLAS acceleration are optional but recommended. Under macOS, a BLAS API is part of the system, so nothing is required.

The name comes from the Greek goddess Iris, messenger of the gods and personification of the rainbow.

Supported model families:

FLUX.2 Klein (by Black Forest Labs):

4B distilled (4 steps, auto guidance set to 1, very fast).

4B base (50 steps for max quality, or less. Classifier-Free Diffusion Guidance, much slower but more generation variety).

9B distilled (4 steps, larger model, higher quality. Non-commercial license).

9B base (50 steps, CFG, highest quality. Non-commercial license).

Z-Image-Turbo (by Tongyi-MAI):

6B (8 NFE / 9 scheduler steps, no CFG, fast).

https://github.com/antirez/qwen-asr:

This is a C implementation of the inference pipeline for Qwen3-ASR speech-to-text models (both 0.6B and 1.7B). It has zero external dependencies beyond the C standard library and a BLAS implementation (Accelerate on macOS, OpenBLAS on Linux). Tokens stream to stdout as they are generated. The implementation runs at speed multiple of the file length even in very modest hardware, like low end Intel or AMD processor.

Important: this implementation explicitly avoids implementing support for MPS. Transcription systems are very important pieces of infrastructure, and are often run on remote Linux servers. Adding the MPS target would focus the efforts too much on Apple hardware, so for now I’m skipping it. The code runs very well anyway on Apple hardware (NEON optimized). Please, don’t send pull requests about this feature, fork the code instead, in order to add MPS support. I’ll add it much later when the other optimizations are already mature.

Supported modes and models

Both normal (offline) and streaming (online) modes are supported. Normal mode defaults to full offline decode (-S 0), so the whole audio is encoded at once. Streaming mode processes audio in 2-second chunks with prefix rollback (it keeps the last few decoded tokens as context for the decoder/LLM when transcribing the next chunk).

Important practical note: in this implementation, interactive --stream prioritizes incremental token stability over throughput and can be much slower than normal mode when you process an already-recorded file end-to-end.

Audio can be piped from stdin (--stdin), making it easy to transcode and transcribe any format via ffmpeg. Language is usually auto-detected from audio, and can be forced with --language. A system prompt can bias the model toward specific terms or spellings.

Both the 0.6B and 1.7B parameters models are supported. While the 1.7B model is generally more powerful, the 0.6B model seems the sweet spot for CPU inference, however the speed difference is not huge, so you may want to try both and decide what to use depending on your use case.

https://github.com/antirez/voxtral.c:

This is a C implementation of the inference pipeline for the Mistral AI’s Voxtral Realtime 4B model. It has zero external dependencies beyond the C standard library. The MPS inference is decently fast, while the BLAS acceleration is usable but slow (it continuously convert the bf16 weights to fp32).

Audio processing uses a chunked encoder with overlapping windows, bounding memory usage regardless of input length. Audio can also be piped from stdin (--stdin), or captured live from the microphone (--from-mic, macOS), making it easy to transcode and transcribe any format via ffmpeg. A streaming C API (vox_stream_t) lets you feed audio incrementally and receive token strings as they become available.

More testing needed: please note that this project was mostly tested against few samples, and likely requires some more work to be production quality. However the hard part, to understand the model inference and reproduce the inference pipeline, is here, so the rest likely can be done easily. Testing it against very long transcriptions, able to stress the KV cache circular buffer, will be a useful task.

Motivations (and some rant)

Thank you to Mistral for releasing such a great model in an Open Weights fashion. However, the author of this project believes that limiting the inference to a partnership with vLLM, without providing a self-contained reference implementation in Python, limits the model’s actual reach and the potential good effects it could have. For this reason, this project was created: it provides both a pure C inference engine and a simple, self-contained Python reference implementation (python_simple_implementation.py) that anyone can read and understand without digging through the vLLM codebase.

c, neural networks

dataman
(27.02.26 18:58:42 MSK)

5 comments

Blend2D in crisis

Forum: Talks

Blend2D is on the verge of shutting down.

This is the typical fate of open-source projects: the author is thoroughly fed up with scraping together one-off donations for a cup of coffee, and he needs to eat as well.

It is literally a cry for help: "the library is used in many places, but corporations maintain it themselves and send neither patches nor money, nor do they offer positions."

For proof, see https://blend2d.com and the author's blog at https://kobalicek.com/funding.html

blend2d, funding

MKuznetsov
(30.01.26 13:13:49 MSK)

13 comments

GCLI 2.10.0

News: Development

On December 31, after almost three months of development, version 2.10.0 of GCLI was released. GCLI is a console utility for working with the APIs of several popular Git hosting services. It lets you create, view, and interact with issues, merge requests, labels, and their comments, check the state of CI and pipelines, and much more.

Unlike GitHub CLI, GCLI supports not only the GitHub API but also the APIs of GitLab, Gitea, Forgejo, and Bugzilla.

>>> Details on GitHub

c, gcli, vcs, console, utility

dataman
(06.01.26 14:18:50 MSK)

30 comments

Websites with system call tables

Forum: Development

https://syscalls.mebeim.net – always up-to-date data; includes call signatures; JSON for individual kernel versions.
https://syscalls.defoy.tech – updated weekly; syscalls.tar.gz with all the CSV files.
https://x64.syscall.sh – arm, arm64, and x86 only; includes call signatures; the site has an API.
https://filippo.io/linux-syscall-table – Linux 6.16-rc1; fuzzy search by name; the source that generates this HTML is written in Go.

Enjoy!

syscall

dataman
(29.12.25 17:27:34 MSK)

6 comments

Show the thread author in the @best_of_lor channel

Forum: Linux-org-ru

So that it is clear what is not worth spending time on.

feature request, lor, telegram

dataman
(25.11.23 19:21:42 MSK)

3 comments

LOR yofication

Forum: Linux-org-ru

Together with @maxcom, we reworked the logic for displaying the letter «Ё», in particular in unconfirmed threads.

The general idea is that in certain words two (2) dots are shown above the letter «Е».

Please report any words you notice where these dots are missing.

lor, news

dataman
(30.11.23 19:45:54 MSK)

22 comments

Showing the number of unconfirmed posts

Forum: Linux-org-ru

Something like this:

All (10) News (1) Gallery (3) Polls (5) Articles (1)

feature request, lor

dataman
(05.12.23 12:37:48 MSK)

12 comments

git replay

Forum: Development

Git 2.44 adds the experimental git replay command:

git-replay - EXPERIMENTAL: Replay commits on a new base, works with bare repos too


SYNOPSIS
--------
(EXPERIMENTAL!) 'git replay' ([--contained] --onto <newbase> | --advance <branch>) <revision-range>...

DESCRIPTION
-----------

Takes ranges of commits and replays them onto a new location. Leaves
the working tree and the index untouched, and updates no references.
The output of this command is meant to be used as input to
`git update-ref --stdin`, which would update the relevant branches
(see the OUTPUT section below).

THIS COMMAND IS EXPERIMENTAL. THE BEHAVIOR MAY CHANGE.

OPTIONS
-------

--onto <newbase>
    Starting point at which to create the new commits.  May be any
    valid commit, and not just an existing branch name.

When `--onto` is specified, the update-ref command(s) in the output will
update the branch(es) in the revision range to point at the new
commits, similar to the way how `git rebase --update-refs` updates
multiple branches in the affected range.

--advance <branch>
    Starting point at which to create the new commits; must be a
    branch name.

When `--advance` is specified, the update-ref command(s) in the output
will update the branch passed as an argument to `--advance` to point at
the new commits (in other words, this mimics a cherry-pick operation).

<revision-range>
    Range of commits to replay. More than one <revision-range> can
    be passed, but in `--advance <branch>` mode, they should have
    a single tip, so that it's clear where <branch> should point
    to. See "Specifying Ranges" in git-rev-parse and the
    "Commit Limiting" options below.

OUTPUT
------

When there are no conflicts, the output of this command is usable as
input to `git update-ref --stdin`.  It is of the form:

    update refs/heads/branch1 ${NEW_branch1_HASH} ${OLD_branch1_HASH}
    update refs/heads/branch2 ${NEW_branch2_HASH} ${OLD_branch2_HASH}
    update refs/heads/branch3 ${NEW_branch3_HASH} ${OLD_branch3_HASH}

where the number of refs updated depends on the arguments passed and
the shape of the history being replayed.  When using `--advance`, the
number of refs updated is always one, but for `--onto`, it can be one
or more (rebasing multiple branches simultaneously is supported).

c, git, vcs, version control

dataman
(25.12.23 22:25:47 MSK)

7 comments

Rawhide: a file search utility with C-like expression syntax

Forum: Desktop

https://github.com/raforg/rawhide

Rawhide (rh) lets you search for files on the command line using expressions and user-defined functions in a mini-language inspired by C. It’s like find(1), but more fun to use. Search criteria can be very readable and self-explanatory and/or very concise and typeable, and you can create your own lexicon of search terms. The output can include lots of detail, like ls(1).

Rawhide (rh) searches the filesystem, starting at each given path, for files that make the given search criteria expression true. If no search paths are given, the current working directory is searched.
The search criteria expression can come from the command line (with the -e option), from a file (with the -f option), or from standard input (stdin) (with -f-). If there is no explicit -e option expression, rh looks for an implicit expression among any remaining command line arguments. If no expression is specified, the default search criteria is the expression 1, which matches all filesystem entries.
An rh expression is a C-like expression that can call user-defined functions.
These expressions can contain all of C’s conditional, logical, relational, equality, arithmetic, and bit operators.

c, find, ls, console, file search

dataman
(25.04.24 20:55:16 MSK)

13 comments

Flux: a C++20 algorithms library with a different iteration model

Forum: Development

This is a header-only (~405 KB) C++20 library in the spirit of C++20 Ranges, Python itertools, Rust iterators, and others. It provides a set of functions broadly equivalent to C++20 Ranges but uses a slightly different iteration model based on cursors rather than iterators.
Flux cursors are a generalization of array indices, whereas STL iterators are a generalization of array pointers.
Features:

a large selection of algorithms and sequence adaptors for building powerful (?) and efficient data pipelines;
improved safety compared with standard iterators;
easier use in common cases, especially when defining your own sequences and adaptors;
more efficient execution of some common operations;
compatibility with existing standard library types and concepts.

Documentation: https://tristanbrindle.com/flux/index.html
Code: https://github.com/tcbrindle/flux
License: Boost 1.0.
Example:

constexpr auto result = flux::ints()                        // 0,1,2,3,...
                         .filter(flux::pred::even)          // 0,2,4,6,...
                         .map([](int i) { return i * 2; })  // 0,4,8,12,...
                         .take(3)                           // 0,4,8
                         .sum();                            // 12

static_assert(result == 12);

The same example in Compiler Explorer: https://flux.godbolt.org/z/KKcEbYnTx.

The project is by the author of the NanoRange library, an implementation of C++20 Ranges for C++17.

c++, library, iterators

dataman
(20.05.24 11:26:19 MSK)

101 comments (pages 2 3)

lug: a DSL with extended PEG for C++17

Forum: Development

After 6.5 years of neglect, the author has released version 0.2.0 of the header-only library lug.

using namespace lug::language;
rule JSON;
rule ExponentPart   = lexeme[ "[Ee]"_rx > ~"[+-]"_rx > +"[0-9]"_rx ];
rule FractionalPart = lexeme[ "."_sx > +"[0-9]"_rx ];
rule IntegralPart   = lexeme[ "0"_sx | "[1-9]"_rx > *"[0-9]"_rx ];
rule Number         = lexeme[ ~"-"_sx > IntegralPart > ~FractionalPart > ~ExponentPart ];
rule Boolean        = lexeme[ "true"_sx | "false" ];
rule Null           = lexeme[ "null" ];
rule UnicodeEscape  = lexeme[ 'u' > "[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]"_rx ];
rule Escape         = lexeme[ "\\" > ("[/\\bfnrt]"_rx | UnicodeEscape) ];
rule String         = lexeme[ "\"" > *("[^\"\\\u0000-\u001F]"_rx | Escape) > "\"" ];
rule Array          = "[" > JSON > *("," > JSON) > "]";
rule Object         = "{" > String > ":" > JSON > *("," > String > ":" > JSON) > "}";
JSON                = Object | Array | String | Number | Boolean | Null;
grammar_ = start(JSON);

https://github.com/jwtowner/lug

c++, dsl, parsing, peg, library

dataman
(03.07.24 06:50:13 MSK)

10 comments

μt: a C++20 unit testing library

Forum: Development

μt¹ is a small (~100 KB), header-only (a single file, ut.hpp) C++20 unit testing library.
Unlike most of its counterparts (Catch2³, doctest⁴, etc.), it is built not on macros but on C++20 language features.
The library supports TDD (test-driven development), BDD (behavior-driven development), and the BDD language Gherkin.
It depends only on std.

Examples:

#include <boost/ut.hpp> // import boost.ut;

constexpr auto sum(auto... values) { return (values + ...); }

int main() {
  using namespace boost::ut;

  "sum"_test = [] {
    expect(sum(0) == 0_i);
    expect(sum(1, 2) == 3_i);
    expect(sum(1, 2) > 0_i and 41_i == sum(40, 2));
  };
}

#include <boost/ut.hpp>
#include <chrono>
#include <iostream>
#include <string>
#include <string_view>

namespace benchmark {
struct benchmark : boost::ut::detail::test {
  explicit benchmark(std::string_view _name)
      : boost::ut::detail::test{"benchmark", _name} {}

  template <class Test>
  auto operator=(Test _test) {
    static_cast<boost::ut::detail::test&>(*this) = [&_test, this] {
      const auto start = std::chrono::high_resolution_clock::now();
      _test();
      const auto stop = std::chrono::high_resolution_clock::now();
      const auto ns =
          std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
      std::clog << "[" << name << "] " << ns.count() << " ns\n";
    };
  }
};

[[nodiscard]] auto operator""_benchmark(const char* _name,
                                        decltype(sizeof("")) size) {
  return ::benchmark::benchmark{{_name, size}};
}

#if defined(__GNUC__) or defined(__clang__)
template <class T>
void do_not_optimize(T&& t) {
  asm volatile("" ::"m"(t) : "memory");
}
#else
#pragma optimize("", off)
template <class T>
void do_not_optimize(T&& t) {
  reinterpret_cast<char volatile&>(t) =
      reinterpret_cast<char const volatile&>(t);
}
#pragma optimize("", on)
#endif
}  // namespace benchmark

int main() {
  using namespace boost::ut;
  using namespace benchmark;

  "string creation"_benchmark = [] {
    std::string created_string{"hello"};
    do_not_optimize(created_string);
  };
}

BDD:

#include <boost/ut.hpp>

int main() {
  using namespace boost::ut::literals;
  using namespace boost::ut::operators::terse;
  using namespace boost::ut::bdd;

  "Scenario"_test = [] {
    given("I have...") = [] {
      when("I run...") = [] {
        then("I should have...") = [] { 1_u == 1u; };
        then("I should have...") = [] { 1u == 1_u; };
      };
    };
  };

  feature("Calculator") = [] {
    scenario("Addition") = [] {
      given("I have number 40") = [] {
        auto number = 40;
        when("I add 2 to number") = [&number] { number += 2; };
        then("I expect number to be 42") = [&number] { 42_i == number; };
      };
    };
  };

  // clang-format off
  scenario("Addition");
    given("I have number 40");
      auto number = 40;

    when("I add 2 to number");
      number += 2;

    then("I expect number to be 42");
      42_i == number;
}

Gherkin:

#include <boost/ut.hpp>
#include <fstream>
#include <functional>  // std::minus
#include <numeric>
#include <streambuf>
#include <string>
#include <vector>  // std::vector used by calculator

template <class T>
class calculator {
 public:
  auto enter(const T& value) -> void { values_.push_back(value); }
  auto add() -> void {
    result_ = std::accumulate(std::cbegin(values_), std::cend(values_), T{});
  }
  auto sub() -> void {
    result_ = std::accumulate(std::cbegin(values_) + 1, std::cend(values_),
                              values_.front(), std::minus{});
  }
  auto get() const -> T { return result_; }

 private:
  std::vector<T> values_{};
  T result_{};
};

int main(int argc, const char** argv) {
  using namespace boost::ut;

  bdd::gherkin::steps steps = [](auto& steps) {
    steps.feature("Calculator") = [&] {
      steps.scenario("*") = [&] {
        steps.given("I have calculator") = [&] {
          calculator<int> calc{};
          steps.when("I enter {value}") = [&](int value) { calc.enter(value); };
          steps.when("I press add") = [&] { calc.add(); };
          steps.when("I press sub") = [&] { calc.sub(); };
          steps.then("I expect {value}") = [&](int result) {
            expect(that % calc.get() == result);
          };
        };
      };
    };
  };

  // clang-format off
  "Calculator"_test = steps |
    R"(
      Feature: Calculator

        Scenario: Addition
          Given I have calculator
           When I enter 40
           When I enter 2
           When I press add
           Then I expect 42

        Scenario: Subtraction
          Given I have calculator
           When I enter 4
           When I enter 2
           When I press sub
           Then I expect 2
    )";
  // clang-format on

  if (argc == 2) {
    const auto file = [](const auto path) {
      std::ifstream file{path};
      return std::string{(std::istreambuf_iterator<char>(file)),
                         std::istreambuf_iterator<char>()};
    };

    "Calculator"_test = steps | file(argv[1]);
  }
}

https://github.com/boost-ext/ut
https://boost-ext.github.io/ut – examples, documentation, benchmarks
https://github.com/catchorg/Catch2
https://github.com/doctest/doctest
https://boost-ext.github.io/ut/denver-cpp-2019 – slide deck.
https://www.youtube.com/watch?v=irdgFyxOs_Y – talk at CppCon 2020.

c++, header-only, unittest, unit testing, library

dataman
(13.07.24 12:10:06 MSK)

13 comments

Reduce the minimum image size

Forum: Linux-org-ru

The logo of a program or library is often smaller than 400x400. Even more often it is not square.
It would be good to lower the minimum to 200.

feature request, lor, images, news

dataman
(15.07.24 19:13:23 MSK)

16 comments

How much memory is needed to compile bmake?

Forum: Development

I am trying to compile bmake (sources) with GCC 14.1. 6 GB of free memory is not enough, and the system freezes hard.
Anyone willing to give it a try? :)

c, gcc, compilation, exploring the world

dataman
(17.07.24 14:56:42 MSK)

20 comments

Syntax highlighting update

Forum: Linux-org-ru

The site engine has received a major syntax highlighting update. Added:

ARMASM
AVRASM
AWK
Basic
Brainfuck
C
JSON
Julia
Lisp
LLVM
Makefile
MIPSASM
Nim
Nix
OCaml
Scheme
TCL
TypeScript
Vim
WASM
X86ASM
YAML

Details

Moved by hobbit from linux-org-ru

javascript, lor, syntax highlighting

dataman
(08.08.24 18:06:57 MSK)

32 comments

Is telemetry coming to LLVM?!

Forum: Talks

https://github.com/llvm/llvm-project/pull/98528 – Implement LLDB Telemetry
https://discourse.llvm.org/t/rfc-lldb-telemetry-metrics/64588 – RFC: LLDB Telemetry/metrics
https://github.com/llvm/llvm-project/pull/102323 – Add a simple Telemetry framework

c++, llvm, every day is Friday, telemetry

dataman
(29.08.24 22:11:26 MSK)

18 comments

Spoilers in posts

Forum: Linux-org-ru

At least in the opening post. They would be especially useful in long articles.
For example, "Applications and utilities worth trying" is hard to read.

An example of possible syntax:

>>> sed → sd
text that is collapsed by default
<<<

feature request, lor, spoilers

dataman
(29.09.24 14:02:28 MSK)

14 comments

Replacing 😊 in reactions with another symbol

Forum: Linux-org-ru

The laughing smiley looks kind of silly!

Candidates:

😀
😁
😆
🙂
😄
😃
… ?

diff --git a/src/main/scala/ru/org/linux/reaction/ReactionService.scala b/src/main/scala/ru/org/linux/reaction/ReactionService.scala
index aa6dde91e..dd31c5a99 100644
--- a/src/main/scala/ru/org/linux/reaction/ReactionService.scala
+++ b/src/main/scala/ru/org/linux/reaction/ReactionService.scala
@@ -97,7 +97,7 @@ object ReactionService {
   val DefinedReactions: Map[String, String] = Map(
     "\uD83D\uDC4D" -> "большой палец вверх",
     "\uD83D\uDC4E" -> "большой палец вниз",
-    "\uD83D\uDE0A" -> "улыбающееся лицо",
+    "\uD83D\uDE42" -> "смешно!",                  // 🙂
     "\uD83D\uDE31" -> "лицо, кричащее от страха",
     "\uD83E\uDD26" -> "facepalm",
     "\uD83D\uDD25" -> "огонь",

feature request, lor, reactions

dataman
(22.11.24 21:11:37 MSK)

41 comments

NASA's Parker probe makes a record close approach to the Sun

Forum: Science & Engineering

Today at 14:53 Moscow time, the American interplanetary probe Parker made its closest flyby of the Sun. At that moment it came within 6.1 million kilometers of our star at a speed of 690 thousand km/h.

Although NASA expects Parker to fly past the Sun at least twice more (provided, of course, that the probe is not damaged by the high temperature), the current flyby set the record for proximity to the Sun.

"On Christmas Eve 1969 we landed people on the Moon; on Christmas Eve 2024 we will try to embrace a star," said Nour Raouafi, a scientist on the Parker Solar Probe project.

There is currently no contact with the probe; it will be restored on December 27. The spacecraft will begin transmitting the collected science data to Earth in late January 2025, when it reaches the position in its orbit most favorable for communication. If all goes well, Parker will continue its mission: it will fly past the Sun again on March 22, 2025 and on June 19, 2025.

The source has been fully yoficated.

nasa, probe, space, sun

dataman
(24.12.24 20:48:23 MSK)

19 comments