Why does akka-stream's Source.groupedWithin not respect the duration?

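Using akka-streams 2.4.17 with the Scala API, I am trying to use Source.groupedWithin(size, duration) with a specified duration. From
the documentation and from what I see in the
source code, a group should be pushed downstream as soon as either the group size is reached or the timeout expires, whichever comes first.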

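When I run a simple flow in fused mode (no async boundary), the duration seems to have no effect. However, when I put .async before or after the groupedWithin call, the timeout works.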

Non-working version

Source.fromIterator(() => aFiniteIterator)
  .map(aLongOperation(_))
  .groupedWithin(1000, 5.seconds) // keeps waiting beyond 5 seconds
  .map(somethingWithGroup(_))
  .runWith(Sink.fold(0)(_ + _))

Working version

Source.fromIterator(() => aFiniteIterator)
  .map(aLongOperation(_))
  .async
  .groupedWithin(1000, 5.seconds) // now respects 5 seconds without full batch
  .map(somethingWithGroup(_))
  .runWith(Sink.fold(0)(_ + _))

Why is this? Is it possible that the non-async version fails to register downstream demand? Or is something else at play?

Update: full code example with output

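For those who want to see the gory details, here is the full code I am running. The context is that I am experimenting with throttling to avoid OOM exceptions.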

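import java.time.LocalDateTime

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

case class Foo(id: String, value: String)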

object Main {
  implicit val system = ActorSystem("akka-streams-oom")
  implicit val materializer = ActorMaterializer()

  def main(args: Array[String]): Unit = {
    println("starting tests...")
    val attempt = Try(forceOOM)

    attempt match {
      case Success(_) => println("all tests passed successfully")
      case Failure(e) => println(s"exception: ${e.getMessage}")
    }

    println("terminating system...")
    system.terminate
    println("system terminated")
    println("done with tests...")
  }

  private def forceOOM: Unit = {
    println("executing forceOOM...")
    val sink = Sink.fold[Int, Int](0)(_ + _)

    val future =
      bigSource
        .map(logEmit)
        .via(slowSubscriber)
        .runWith(sink)

    val finalResult = Await.result(future, Duration.Inf)
    println(s"forceOOM result: $finalResult")
  }

  private def bigSource = {
    val largeIterator = () =>
      Iterator
        .from(0,1000000000)
        .map(_ => generateLargeFoo)

    Source.fromIterator(largeIterator)
  }

  private def slowSubscriber =
    Flow[Foo]
      .map { foo =>
        println(s"allocating memory for ${foo.id} at ${time}")
        Foo(foo.id, bloat)
      }
      .async  // if i remove this, the 5 second window below doesn't seem to work
      .groupedWithin(100, 5.seconds)
      .map(foldFoos)

  private def logEmit(x: Foo): Foo = {
    println(s"emitting next record: ${x.id} at ${time}")
    x
  }

  private def foldFoos(x: Seq[Foo]): Int = {
    println(s"folding records                                            at ${time}")
    x.map(_.value.length).fold(0)(_ + _)
  }

  private def time: String = LocalDateTime.now.toLocalTime.toString

  private def bloat: String = {
    (0 to 10)
      .map(_ => generateLargeFoo.value)
      .fold("")(_ + _)
  }

  private def generateLargeFoo: Foo = {
    Foo(java.util.UUID.randomUUID.toString, (0 to 1000000).mkString)
  }
}

Output without async (timeout exceeded)

emitting next record: 5016fea4-f076-45dd-b95b-1d24f71a25b4 at 09:34:25.826
allocating memory for 5016fea4-f076-45dd-b95b-1d24f71a25b4 at 09:34:25.868
emitting next record: ab6e298b-0152-4af5-b685-bb4ed6c5b9de at 09:34:27.572
allocating memory for ab6e298b-0152-4af5-b685-bb4ed6c5b9de at 09:34:27.572
emitting next record: 6f5c1b75-5aaf-44e6-ac62-a6074735c057 at 09:34:28.957
allocating memory for 6f5c1b75-5aaf-44e6-ac62-a6074735c057 at 09:34:28.958
emitting next record: 313ce2b5-f669-4c59-b2ec-eafdae85ded6 at 09:34:30.378
allocating memory for 313ce2b5-f669-4c59-b2ec-eafdae85ded6 at 09:34:30.378
emitting next record: 91a8a95b-b3cc-4e27-8d3f-3400fa9c7a9f at 09:34:31.802
allocating memory for 91a8a95b-b3cc-4e27-8d3f-3400fa9c7a9f at 09:34:31.802
emitting next record: 0220e75a-029b-4d35-8494-690bed6938aa at 09:34:33.173
allocating memory for 0220e75a-029b-4d35-8494-690bed6938aa at 09:34:33.174
emitting next record: faa16b80-cfb1-4ea4-b3ba-c1d270caf865 at 09:34:34.409
allocating memory for faa16b80-cfb1-4ea4-b3ba-c1d270caf865 at 09:34:34.409
emitting next record: 8956d710-ad55-4dee-b4f3-82b8cf313a85 at 09:34:35.656
allocating memory for 8956d710-ad55-4dee-b4f3-82b8cf313a85 at 09:34:35.656
emitting next record: 1b989c56-6580-44f0-b8d9-46d5241046cc at 09:34:36.944
allocating memory for 1b989c56-6580-44f0-b8d9-46d5241046cc at 09:34:36.945
emitting next record: 66a766c7-29e0-40ca-b997-54985aad75d6 at 09:34:38.272
allocating memory for 66a766c7-29e0-40ca-b997-54985aad75d6 at 09:34:38.272
emitting next record: b8d29dad-bd44-4843-936e-5eb5df3bb594 at 09:34:39.530
allocating memory for b8d29dad-bd44-4843-936e-5eb5df3bb594 at 09:34:39.530
emitting next record: 8c7999cf-7796-427e-a155-c28d7fc4a934 at 09:34:40.987
allocating memory for 8c7999cf-7796-427e-a155-c28d7fc4a934 at 09:34:40.988
emitting next record: eda79635-4559-4c92-a5b7-83bbfc2e85b2 at 09:34:42.382
allocating memory for eda79635-4559-4c92-a5b7-83bbfc2e85b2 at 09:34:42.382
emitting next record: 8fa5d744-70e8-4261-9c3f-427737233e13 at 09:34:43.593
allocating memory for 8fa5d744-70e8-4261-9c3f-427737233e13 at 09:34:43.593
emitting next record: cc621484-c70d-4092-8dc6-2e39acc1f0b3 at 09:34:44.983
allocating memory for cc621484-c70d-4092-8dc6-2e39acc1f0b3 at 09:34:44.983
emitting next record: fbc03c9c-1ea8-4d4d-9a80-13118324140d at 09:34:46.244
allocating memory for fbc03c9c-1ea8-4d4d-9a80-13118324140d at 09:34:46.244
emitting next record: 96374d33-e117-4f48-b3be-79b8cb1e0fda at 09:34:47.953
allocating memory for 96374d33-e117-4f48-b3be-79b8cb1e0fda at 09:34:47.953
emitting next record: 1c210d73-35d3-41b9-ade6-9310783589a3 at 09:34:49.303
allocating memory for 1c210d73-35d3-41b9-ade6-9310783589a3 at 09:34:49.303
emitting next record: 3872c382-17a9-484a-861c-6f66a0c7d0ca at 09:34:50.620
allocating memory for 3872c382-17a9-484a-861c-6f66a0c7d0ca at 09:34:50.620
emitting next record: c34ba954-a9ff-45d1-910c-316c6eb9c85d at 09:34:52.597
allocating memory for c34ba954-a9ff-45d1-910c-316c6eb9c85d at 09:34:52.597
emitting next record: 8e5f804e-5e75-4eac-937f-651d45e3745d at 09:34:54.145
allocating memory for 8e5f804e-5e75-4eac-937f-651d45e3745d at 09:34:54.145
emitting next record: 1caf82cc-7b41-4730-bcc1-ca61ee7780e0 at 09:34:56.454
allocating memory for 1caf82cc-7b41-4730-bcc1-ca61ee7780e0 at 09:34:56.455
emitting next record: 9364d386-408a-4b63-80b5-0ed34473ba45 at 09:34:58.706
allocating memory for 9364d386-408a-4b63-80b5-0ed34473ba45 at 09:34:58.706
emitting next record: c43baaba-961e-4877-9835-7eeee538f0af at 09:35:00.822
allocating memory for c43baaba-961e-4877-9835-7eeee538f0af at 09:35:00.822
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing "kill -9 96871"... java.lang.RuntimeException: Nonzero exit code returned from runner: 137 at scala.sys.package$.error(package.scala:27)

Output with async (timeout works)

emitting next record: 668d6f9f-43cc-45a6-99b3-d8e8ab2b9cae at 09:28:48.188
allocating memory for 668d6f9f-43cc-45a6-99b3-d8e8ab2b9cae at 09:28:48.231
emitting next record: 6c50b3e1-d3ec-422e-b41a-fe3d92df15a9 at 09:28:48.333
emitting next record: 20b659f9-73e1-4c67-b251-2b224eec4d24 at 09:28:48.421
emitting next record: 9af08f07-8246-498b-9f64-b56982cf3536 at 09:28:48.497
emitting next record: 14cdf3b4-d14f-4953-8609-24c7a1996a12 at 09:28:48.569
emitting next record: 571002f3-7301-4afa-8bc9-3fb8a9e84db2 at 09:28:48.665
emitting next record: 5e88a51b-b56c-40fe-84a3-2fcf18b90e3f at 09:28:48.787
emitting next record: e66b29f3-1690-4645-a048-19049e92303a at 09:28:48.846
emitting next record: 66c16074-b200-4808-a990-13abadc66e43 at 09:28:48.943
emitting next record: 1de8caca-fa48-4777-90a7-1449bd6722bb at 09:28:49.003
emitting next record: bc3859b6-94ab-4262-b4cd-fa757e8f3f1f at 09:28:49.064
emitting next record: 988216a7-5944-4aa5-98f6-b36542d8e7a8 at 09:28:49.172
emitting next record: e6ab4ef6-1fd2-471b-8866-2f8422346df5 at 09:28:49.325
emitting next record: c86b3116-70c8-453e-9ddf-bd8d9e144caf at 09:28:49.384
emitting next record: 78c68185-cdd1-4fde-aa39-e03b37b5f449 at 09:28:49.603
emitting next record: 7ed11952-ceba-47f5-9ba4-25d1e9dceea0 at 09:28:49.671
allocating memory for 6c50b3e1-d3ec-422e-b41a-fe3d92df15a9 at 09:28:50.164
allocating memory for 20b659f9-73e1-4c67-b251-2b224eec4d24 at 09:28:51.459
allocating memory for 9af08f07-8246-498b-9f64-b56982cf3536 at 09:28:52.752
folding records at 09:28:53.106
allocating memory for 14cdf3b4-d14f-4953-8609-24c7a1996a12 at 09:28:53.969
allocating memory for 571002f3-7301-4afa-8bc9-3fb8a9e84db2 at 09:28:55.234
allocating memory for 5e88a51b-b56c-40fe-84a3-2fcf18b90e3f at 09:28:56.422 ...

Best answer

I suspect you are simulating aLongOperation with Thread.sleep or some other blocking operation.

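If that is the case, then without a forced async boundary the whole graph is fused into a single actor, and therefore runs on a single thread at a time. Blocking that thread starves the underlying scheduling infrastructure, so the timer that groupedWithin relies on cannot fire on time (see the
docs).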

Try simulating your long operation in a non-blocking way (for example, using the after pattern).
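
A minimal sketch of what such a non-blocking simulation might look like, assuming Akka 2.4.x is on the classpath. The names NonBlockingSketch and slowButNonBlocking, as well as the 300 ms delay and the 1 second window, are hypothetical and chosen only for illustration. akka.pattern.after completes a Future after a delay via the actor system's scheduler, so no thread is parked while the "long operation" is pending.

```scala
import akka.actor.ActorSystem
import akka.pattern.after
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object NonBlockingSketch {

  // Hypothetical stand-in for aLongOperation: the result becomes available
  // after 300 ms, but no thread is blocked while waiting.
  def slowButNonBlocking(i: Int)(implicit system: ActorSystem): Future[Int] = {
    import system.dispatcher
    after(300.millis, system.scheduler)(Future.successful(i))
  }

  def run(): Seq[Seq[Int]] = {
    implicit val system = ActorSystem("grouped-within-sketch")
    implicit val materializer = ActorMaterializer()
    try {
      val groups = Source(1 to 10)
        // mapAsync waits for each Future without blocking the stream's thread
        .mapAsync(parallelism = 1)(slowButNonBlocking(_))
        // No .async here: the graph stays fused, yet the timer can still fire
        .groupedWithin(1000, 1.second)
        .runWith(Sink.seq)
      // Blocking here only affects the calling thread, not the stream
      Await.result(groups, 30.seconds)
    } finally system.terminate()
  }

  def main(args: Array[String]): Unit =
    run().foreach(g => println(s"group of ${g.size}: $g"))
}
```

Under these assumptions the ten elements should come out as several small groups rather than one group of ten, because the fused stage is free to process timer events between elements. If the long operation is genuinely blocking and cannot be rewritten, keeping the .async boundary from the question (ideally with a dedicated dispatcher for the blocking stage) remains an option.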

See also the following issue raised on this topic.
