Insights: tensorflow/tensorflow
Overview
221 Pull requests merged by 7 people
-
Internal, visibility only changes to public code.
#97207 merged
Jul 19, 2025 -
Add visibility to hlo_input_output_format
#96758 merged
Jul 19, 2025 -
Use a literal sentinel value for kernel init failure
#97193 merged
Jul 19, 2025 -
Reduce redundancy between StringTo* enum functions.
#97201 merged
Jul 19, 2025 -
[XLA:CPU] Refactor Intrinsic and use it in all math intrinsics.
#97000 merged
Jul 19, 2025 -
Integrate LLVM at llvm/llvm-project@06ae0c2a1086
#97154 merged
Jul 19, 2025 -
Update `nccl_archive` BUILD file to fix TF GPU wheel build.
#97206 merged
Jul 19, 2025 -
[XLA:GPU] Add a verifier to the GPU compiler before post-scheduling pipeline.
#97150 merged
Jul 19, 2025 -
Use host callback in the CopyToHostFuture method in Async PjRt.
#97203 merged
Jul 18, 2025 -
Add function `ExtractDynamicSliceFromCollectiveUser` to extract a dynamic slice user from a collective.
#96802 merged
Jul 18, 2025 -
no external change
#97100 merged
Jul 18, 2025 -
Reverts 849435a30d0487e415126507953575358ed3c4eb
#97190 merged
Jul 18, 2025 -
Reverts 2a45c5b0c326e20eafe833df055326b39edadcf2
#97071 merged
Jul 18, 2025 -
Bump `sqlite` to 3.50.3
#97191 merged
Jul 18, 2025 -
Typo fix "perferred" -> "preferred".
#97198 merged
Jul 18, 2025 -
PR #28257: [XLA:GPU] Update ONEAPI crosstool compiler wrapper
#97149 merged
Jul 18, 2025 -
Use ASSERT_THAT to check pass.Run() result
#97164 merged
Jul 18, 2025 -
Update the XNNPack delegate README.
#97181 merged
Jul 18, 2025 -
Annotate some XLA:GPU flags as stable i.e. they should provide 6 month deprecation notice.
#97134 merged
Jul 18, 2025 -
[XLA:GPU] Add a test for DotForInt4vsIdentityBF16ReturnsCorrectResult.
#97064 merged
Jul 18, 2025 -
PR #28985: [XLA:GPU] Add shared_memory_per_block_optin device info member
#97140 merged
Jul 18, 2025 -
Update README.md
#96902 merged
Jul 18, 2025 -
Update dependencies to XNNPACK.
#97177 merged
Jul 18, 2025 -
[XLA:GPU] Move Dot strength reduction out of algebraic simplifier
#97166 merged
Jul 18, 2025 -
[XLA:GPU] Remove CHECK-CSE since it is not used.
#97129 merged
Jul 18, 2025 -
#sdy improve the error messaging when importing and exporting sharding custom calls.
#97041 merged
Jul 18, 2025 -
Introduce stable flags and associated deprecation policy for XLA debug options.
#97049 merged
Jul 18, 2025 -
Use GetInPlaceInputOutputPairs from AliasInfo instead of HloDataflowAnalysis.
#97170 merged
Jul 18, 2025 -
Remove ifdef from ir_emitter_unnested and fix various clang-tidy warnings
#97127 merged
Jul 18, 2025 -
Add TmaMetadata serialization support
#97103 merged
Jul 18, 2025 -
Automated Code Change
#97109 merged
Jul 18, 2025 -
Move GetInPlaceInputOutputPairs and related code to AliasInfo class (NFC).
#97119 merged
Jul 18, 2025 -
Automated Code Change
#97123 merged
Jul 18, 2025 -
Fix tests paths and visibility issue for tflite/converter
#97147 merged
Jul 18, 2025 -
Remove leftover logging
#97145 merged
Jul 18, 2025 -
Automated Code Change
#97033 merged
Jul 18, 2025 -
Update PjRtCpuExecutable to not rely on any internals of PjRtCpuBuffer.
#97146 merged
Jul 18, 2025 -
Handle V2 `xla::OpSharding` in `ExtractInputsForLogicalDevices` and `ParseAndValidateOutputSharding`.
#97136 merged
Jul 18, 2025 -
Update version to 2.21.0
#97079 merged
Jul 18, 2025 -
[XLA][host offloading] Return AsyncValue from HostOffloadingExecutable.
#96915 merged
Jul 17, 2025 -
#sdy update dump names and add index as prefix so they would be clearer for users
#97117 merged
Jul 17, 2025 -
[Autotuner] Add block level emitter backend for Triton fusion (3).
#96798 merged
Jul 17, 2025 -
[IFRT] Add `UserContextScope`
#97012 merged
Jul 17, 2025 -
Add ReleaseDeviceMemoryOwnership implementation based on
#97144 merged
Jul 17, 2025 -
Migrate uses of `XLA_TEST_BACKEND` macros to use utilities in `xla_test_backend_predicates.h`
#97135 merged
Jul 17, 2025 -
Correctly identify async start and done ops in latency hiding scheduler.
#97089 merged
Jul 17, 2025 -
[xla:cpu] Make DotLibraryRewriter support greedy fusion mode.
#96319 merged
Jul 17, 2025 -
Internal change only
#97065 merged
Jul 17, 2025 -
Optimize `BM_GlobalDecreasingSizeBestFitHeap` benchmark by up to 3%.
#97075 merged
Jul 17, 2025 -
Update release notes for TensorFlow 2.20.0
#97080 merged
Jul 17, 2025 -
Relax the folding size threshold to 200 MiB.
#97078 merged
Jul 17, 2025 -
Update CommonPjRtBufferImpl to have specialized versions for both cpu->device
#97085 merged
Jul 17, 2025 -
[Autotuner] Add block level emitter backend for Triton fusion (2).
#96796 merged
Jul 17, 2025 -
Use ASSERT_THAT(..., IsOkAndHolds(true)) for consistency and correctness
#97005 merged
Jul 17, 2025 -
fix(dtensor): guard against `nullptr` from `TF_TensorData` in `ExtractSmallTensorValue`
#96866 merged
Jul 17, 2025 -
Reverts 812bb86d50b1cee5cf32ccb1629a49687e924ea5
#97098 merged
Jul 17, 2025 -
Simplify ShouldSkipForSideEffect function in zero_sized_hlo_elimination.
#97101 merged
Jul 17, 2025 -
[XLA:GPU] Remove unused `DotSparsityRewriter`.
#97128 merged
Jul 17, 2025 -
Automated Code Change
#97122 merged
Jul 17, 2025 -
[XLA:GPU] additional logging in triton fusion numeric verifier
#97056 merged
Jul 17, 2025 -
[xla:gpu][triton] `triton-xla-squeeze-dims` pass improvements.
#97099 merged
Jul 17, 2025 -
Automated Code Change
#96959 merged
Jul 17, 2025 -
PR #28073: [XLA:GPU][oneAPI] Enable Level_zero support
#97022 merged
Jul 17, 2025 -
Remove deprecated HloAliasAnalysis::Run method
#97044 merged
Jul 17, 2025 -
Add serialization and deserialization for the cuDNN thunk
#96914 merged
Jul 17, 2025 -
no external change
#96942 merged
Jul 17, 2025 -
[xla] Optimize ShapeUtil::ForEach traversals
#97063 merged
Jul 17, 2025 -
Support INT16 for PRelu op
#96899 merged
Jul 17, 2025 -
[xla:tf] Check if device shape is already a host shape
#97018 merged
Jul 17, 2025 -
Add int16 kernel support for DIV op
#96934 merged
Jul 17, 2025 -
Rollback https://github.com/openxla/xla/commit/cf3dfa9723c4cd4e2b25a606207a201a95fe71db
#97074 merged
Jul 17, 2025 -
Fix //tflite/converter/tests/... MLIR tests by fixing .bzl rules and redirecting tensorflow submodule
#97003 merged
Jul 16, 2025 -
Update release notes at HEAD
#97073 merged
Jul 16, 2025 -
Enable --flaky_test_attempts in release branch
#97076 merged
Jul 16, 2025 -
Move op name longest prefix logic from annotation.cc to somewhere upper level
#93906 merged
Jul 16, 2025 -
Internal change only
#96928 merged
Jul 16, 2025 -
Refactor optimized div for int8 and uint8
#96933 merged
Jul 16, 2025 -
Add Hermetic C++ Toolchains for Linux x86_64 builds.
#96803 merged
Jul 16, 2025 -
Migrate uses of `XLA_TEST_BACKEND` macros to use utilities in `xla_test_backend_predicates.h`
#97006 merged
Jul 16, 2025 -
[JAX]: rollforward. Add ability to add a transfer server factory to override
#97069 merged
Jul 16, 2025 -
Update dependencies to XNNPACK and cpuinfo.
#96990 merged
Jul 16, 2025 -
Complete the CommonPjRtBufferImpl implementation.
#97001 merged
Jul 16, 2025 -
[xla] Move xla::Shape functions that are used on a hot path to header file
#97057 merged
Jul 16, 2025 -
Increase the size of `__tensorflow_core_lib_core_legacy_lib_core_all_tests` to deflake CI.
#97061 merged
Jul 16, 2025 -
Support composite unpack and pack legalization with dynamic shape
#97062 merged
Jul 16, 2025 -
Reverts e52a31e166af020e465c7494a6353f098a65155c
#97066 merged
Jul 16, 2025 -
Rollback for missing header
#97067 merged
Jul 16, 2025 -
#sdy Mark `xla.sdy.LocalToGlobalShape` custom call as side effecting so it isn't removed if unused.
#97037 merged
Jul 16, 2025 -
Update 06 broken links in question_answer.md
#94881 merged
Jul 16, 2025 -
Added `PjrtClient::UpdateGlobalProcessInfo` method.
#95611 merged
Jul 16, 2025 -
[tf] Use non-owning ShapeTree to pass execution inputs to XLA
#97055 merged
Jul 16, 2025 -
PR #28877: [XLA]Clamp num_workers to avoid partition overflow
#97046 merged
Jul 16, 2025 -
[XLA:GPU] Disable horizontal loop fusion.
#96410 merged
Jul 16, 2025 -
[tf] Use non-owning ShapeTree to pass execution inputs to XLA
#97053 merged
Jul 16, 2025 -
[XLA] Be less aggressive about recursively updating metadata when inlining.
#97043 merged
Jul 16, 2025 -
[XLA:GPU] Move IsIntermediate & FindHero to shared ir_emission_utils.
#97052 merged
Jul 16, 2025 -
Move HloAliasAnalysis out of HloModuleGroupMetadata (NFC).
#97036 merged
Jul 16, 2025 -
Pass proper AliasInfo to HloAliasAnalysis::Run in tests (NFC).
#97039 merged
Jul 16, 2025 -
[XLA:GPU] Update documentation for triton_xla.extract/insert.
#97038 merged
Jul 16, 2025 -
[xla][gpu][triton] Temporarily disable triton squeeze dims pass, due to internal benchmark regression.
#97028 merged
Jul 16, 2025 -
Remove unused HloAliasAnalysis instance (NFC).
#97024 merged
Jul 16, 2025 -
Skip TreeReductionRewriter for Slinky.
#96968 merged
Jul 16, 2025 -
[XLA:GPU] update triton test for generic emitter
#96994 merged
Jul 16, 2025 -
Automated Code Change
#96966 merged
Jul 16, 2025 -
[xla] Add benchmark for ShapeUtil::SubshapeCount
#97021 merged
Jul 16, 2025 -
Reverts e74d259786b388e8ff7af90d426e665c84388229
#96991 merged
Jul 16, 2025 -
Automated Code Change
#96965 merged
Jul 16, 2025 -
Automated Code Change
#96957 merged
Jul 16, 2025 -
[xla] Change the order of std::variant types in MaybeOwningDeviceMemory
#97007 merged
Jul 16, 2025 -
The raw buffer CopyToMemorySpace don't seem to quite work yet cross client, so avoid
#97011 merged
Jul 16, 2025 -
[xla] Optimize constructing ShapeTree
#97008 merged
Jul 16, 2025 -
[JAX] Cache transfer server connections for cross-host device_put.
#97002 merged
Jul 15, 2025 -
Update target define states before we update ready list.
#97004 merged
Jul 15, 2025 -
Reverts e8964b7d937c027100b0b4aed68f02ac57ea0333
#96996 merged
Jul 15, 2025 -
Create `xla::test::Empty` for instantiating empty test suites.
#96997 merged
Jul 15, 2025 -
Add ::GetReadyFuturePromise to be used in implementing
#96951 merged
Jul 15, 2025 -
Add an option to do multiple executions of the same module to HloRunners.
#96752 merged
Jul 15, 2025 -
[tf:xla] Avoid accidental copies of large Op attributes
#96952 merged
Jul 15, 2025 -
Add deprecation message for `TFLITE_XNNPACK_DELEGATE_FLAG_ENABLE_SUBGRAPH_RESHAPING`
#96688 merged
Jul 15, 2025 -
Pass proper AliasInfo to HloAliasAnalysis::Run (NFC).
#96983 merged
Jul 15, 2025 -
[XLA][Numerics][HLO Value Tracking] Add recovery modules when removing nested reshapes on TPU
#96503 merged
Jul 15, 2025 -
Add CopyToMemorySpace which calls DirectCopyToMemorySpace or
#96947 merged
Jul 15, 2025 -
#HLODiff Remove text diff summary
#96938 merged
Jul 15, 2025 -
#HLODiff Update print progress at the end of matcher to show 100%.
#96937 merged
Jul 15, 2025 -
[XLA:CPU] Don't expand tanh at the fusion level.
#96987 merged
Jul 15, 2025 -
[IFRT] Do not set MHLO shardings if sdy partitioned
#96903 merged
Jul 15, 2025 -
Handle GetDonatableInputIndices() errors
#96954 merged
Jul 15, 2025 -
[XLA:CPU] Disable fusion level vectorization.
#96986 merged
Jul 15, 2025 -
Add missing header.
#96878 merged
Jul 15, 2025 -
[XLA:CPU][XLA:GPU] Set default alignment of vector load/store as that of the vector element type.
#96982 merged
Jul 15, 2025 -
avoid failure when docstrings have been stripped (python -OO)
#96906 merged
Jul 15, 2025 -
#sdy Clean up `AddAxisOrMergeInserter` in dedup_meshes
#96254 merged
Jul 15, 2025 -
[ifrt] Fix spelling in CopyArraysOp description.
#96993 merged
Jul 15, 2025 -
Disable failure_handler_test for Mac
#96943 merged
Jul 15, 2025 -
PR #28716: [GPU] Make fabric info test compatible with lower CUDA driver versions
#96788 merged
Jul 15, 2025 -
Remove MeshAttr builder that takes a single int
#96988 merged
Jul 15, 2025 -
#sdy Mark `xla.sdy.LocalToGlobalShape` custom call as side effecting so it isn't removed if unused.
#96909 merged
Jul 15, 2025 -
Migrate away from ArrayRef(std::nullopt_t)
#96989 merged
Jul 15, 2025 -
[XLA:GPU] Implement tiling for dot.
#96676 merged
Jul 15, 2025 -
PR #28728: Add Nvidia benchmarks
#96922 merged
Jul 15, 2025 -
Make Thunk keep an instance of ThunkInfo directly (NFC)
#96910 merged
Jul 15, 2025 -
Remove workarounds for missing ABSL_DEPRECATE_AND_INLINE
#96984 merged
Jul 15, 2025 -
[XLA:CPU][XLA:GPU] Increase limit in number of iterations of UnswitchLoopsPass.
#96913 merged
Jul 15, 2025 -
[xla:cpu] Add DotLibraryRewriter rewrite options for oneDNN and XNNPACK.
#96981 merged
Jul 15, 2025 -
Automated Code Change
#96978 merged
Jul 15, 2025 -
fix(proto_splitter): return error if `FindFieldByNumber` yields null `field_desc` in `ProcessField`
#96429 merged
Jul 15, 2025 -
[xla:cpu] Tiny improvements for documentation and function names
#96976 merged
Jul 15, 2025 -
Fix shardy_xla_pass_test that is failing
#96946 merged
Jul 15, 2025 -
[XLA:CPU][XLA:GPU] Fix missing layout on emitted constants.
#96791 merged
Jul 15, 2025 -
Automated Code Change
#96972 merged
Jul 15, 2025 -
Remove dependency on KernelArguments from CudnnThunk
#96911 merged
Jul 15, 2025 -
[XLA:GPU] Do not multi-output fuse sibling transposes with reductions.
#96774 merged
Jul 15, 2025 -
Migrate away from ArrayRef(std::nullopt_t)
#96944 merged
Jul 15, 2025 -
Fix incorrect per-channel scaling in fully_connected on Android
#96522 merged
Jul 15, 2025 -
Automated Code Change
#96690 merged
Jul 15, 2025 -
PR #28401: [ROCm] Fix PackedTranspose for adapting to warp size 64
#96971 merged
Jul 15, 2025 -
PR #25914: [NVIDIA GPU] Add nvshmem communicator and runtime thunks
#96897 merged
Jul 15, 2025 -
[XLA] Propagate `op_name`s recursively in the `CallInliner`.
#96926 merged
Jul 15, 2025 -
Fix test-case when NVML library is not available.
#96917 merged
Jul 15, 2025 -
[xla:cpu] Mark cpu_function_runtime alignment as deprecated
#96732 merged
Jul 15, 2025 -
initial implementation of send/recv static verification
#96517 merged
Jul 15, 2025 -
Remove unused ExecutionProfile option.
#96680 merged
Jul 15, 2025 -
Add `HloAsyncStartInstruction::AddCallOperand` to mirror `HloCallInstruction::AddCallOperand`.
#96742 merged
Jul 15, 2025 -
[xla:codegen] Migrate Fptrunc to GetOrInsertDeclaration API
#96949 merged
Jul 15, 2025 -
Check shape rank is less than XNN_MAX_TENSOR_DIMS for TRANSPOSE
#96813 merged
Jul 15, 2025 -
[XLA] Refactoring Reduce Window Rewriter to reduce complexity
#96815 merged
Jul 15, 2025 -
Migrate away from ArrayRef(std::nullopt_t)
#96940 merged
Jul 15, 2025 -
[xla:codegen] Use Intrinsic::Type in Fptruc::CreateDefinition
#96921 merged
Jul 15, 2025 -
Add 'mode' attribute to AllReduce and ReduceScatter.
#96214 merged
Jul 14, 2025 -
Reverts d41335bc8404d9a347913fa9c776ca4a1bb7e94a
#96939 merged
Jul 14, 2025 -
[Efficiency]Cleanup unused metrics which track the pjrt compilation status.
#96736 merged
Jul 14, 2025 -
[IFRT IR] Add pipeline for compiling IFRT IR programs
#96891 merged
Jul 14, 2025 -
[XLA:GPU] update determinism test to use generic triton emitter
#96920 merged
Jul 14, 2025 -
Allow biases with rank > 1 to be fused to FC
#96595 merged
Jul 14, 2025 -
set layout assignment for the result correctly
#96421 merged
Jul 14, 2025 -
Add CloneWithControlDependency which is used to implement
#96804 merged
Jul 14, 2025 -
Automated Code Change
#96828 merged
Jul 14, 2025 -
[XLA:GPU]: Enable two-shot all reduce implementation for usage.
#96484 merged
Jul 14, 2025 -
Reverts e5982331c429fce7c9c4389f2d15eee5ff3e9791
#96751 merged
Jul 14, 2025 -
set default layout when exporting dense constants from HLO to MLIR
#96737 merged
Jul 14, 2025 -
Re-enable precompilation for some tests.
#96747 merged
Jul 14, 2025 -
[XLA:GPU] enable nested fusion for autotuner test
#96916 merged
Jul 14, 2025 -
quick fix for sigill on non-null device
#96748 merged
Jul 14, 2025 -
Align `AtLocation` signature with Abseil `LogMessage::AtLocation`.
#96907 merged
Jul 14, 2025 -
[XLA] Use "edge time indices" to skip some redundant calls to FindChunkCandidate.
#96744 merged
Jul 14, 2025 -
[XLA:CPU] Move erf32 approximation to mathlib.
#96783 merged
Jul 14, 2025 -
[XLA:CPU] Add expm1 expansion.
#96782 merged
Jul 14, 2025 -
[XLA:GPU]: Calculate rank_offset and rotated_ranks outside the kernel.
#95954 merged
Jul 14, 2025 -
[XLA:CPU] Move passes from expand_float_ops that lower to math lib.
#96781 merged
Jul 14, 2025 -
[XLA:GPU]: Calculate launch dimensions based on input size.
#95893 merged
Jul 14, 2025 -
Pass proper AliasInfo to HloAliasAnalysis::Run in HostOffloader (NFC).
#96904 merged
Jul 14, 2025 -
[XLA:GPU] Print fusion string when selecting the best result, instead of root string.
#96787 merged
Jul 14, 2025 -
[xla][gpu][triton] Do not duplicate code in squeeze dims pass, re-enable the pass.
#96894 merged
Jul 14, 2025 -
Disable NVSHMEM send-recv test-case due to flakiness.
#96892 merged
Jul 14, 2025 -
PR #28295: [NVIDIA GPU] Do out of place allreduce for nvshmem
#96893 merged
Jul 14, 2025 -
[XLA:GPU] Remove code for horizontal_input_fusion.
#96424 merged
Jul 14, 2025 -
Update `StreamExecutorGpuClientTest.PropagateError` test to expect unpacked tuples
#96901 merged
Jul 14, 2025 -
XLA:GPU: Fix method ambiguity on CUDA 12.4
#96877 merged
Jul 14, 2025 -
Avoid using PointsToAnalysis in DFSMemoryScheduler (NFC).
#96718 merged
Jul 14, 2025 -
Apply patch to fix compile error on windows (NFC).
#96896 merged
Jul 14, 2025 -
Always stage transfers when doing d2h copy to avoid memory corruption issue.
#96821 merged
Jul 14, 2025 -
[xla:codegen] Use Intrinsic::Type in Fptruc::GetOrInsertDeclaration
#96858 merged
Jul 13, 2025 -
Reverts 6503034148ab3c0469a32d20b9a3ea397457a8f8
#96829 merged
Jul 13, 2025 -
Fix typo in xnn_fusion_thunk.cc.
#96823 merged
Jul 12, 2025
107 Pull requests opened by 4 people
-
Automated Code Change
#96850 opened
Jul 12, 2025 -
Automated Code Change
#96851 opened
Jul 12, 2025 -
Automated Code Change
#96853 opened
Jul 12, 2025 -
Automated Code Change
#96856 opened
Jul 12, 2025 -
Automated Code Change
#96857 opened
Jul 12, 2025 -
Automated Code Change
#96862 opened
Jul 12, 2025 -
Automated Code Change
#96863 opened
Jul 12, 2025 -
Automated Code Change
#96864 opened
Jul 12, 2025 -
Automated Code Change
#96905 opened
Jul 14, 2025 -
Migrate ListScheduler from TuplePointsToAnalysis to HloAliasAnalysis (NFC).
#96908 opened
Jul 14, 2025 -
Automated Code Change
#96912 opened
Jul 14, 2025 -
Automated Code Change
#96923 opened
Jul 14, 2025 -
Automated Code Change
#96924 opened
Jul 14, 2025 -
Fix cost analysis on for output byte accessed when result is tuple
#96927 opened
Jul 14, 2025 -
Automated Code Change
#96929 opened
Jul 14, 2025 -
Avoid crashing when LRU cache keys change.
#96930 opened
Jul 14, 2025 -
Automated Code Change
#96931 opened
Jul 14, 2025 -
test PR #28728: Add Nvidia benchmarks
#96941 opened
Jul 14, 2025 -
There is nothing in this change going to 3rd party.
#96950 opened
Jul 15, 2025 -
Integrate LLVM at llvm/llvm-project@06ae0c2a1086
#96953 opened
Jul 15, 2025 -
[xla:gpu][triton] Add squeeze_dims of tt.descriptor_load rewrite.
#96979 opened
Jul 15, 2025 -
Integrate LLVM at llvm/llvm-project@0d5325bb203f
#96998 opened
Jul 15, 2025 -
lite: Add config option to enable benchmark_model
#96999 opened
Jul 15, 2025 -
Fix tflite converter MLIR tests with copy's of td_ops
#97009 opened
Jul 16, 2025 -
Minor improve IfrtServingExecutable performance
#97010 opened
Jul 16, 2025 -
Automated Code Change
#97013 opened
Jul 16, 2025 -
Automated Code Change
#97025 opened
Jul 16, 2025 -
Update deps:
#97035 opened
Jul 16, 2025 -
PR #28735: [XLA:GPU] Enabling cuda graph concurrent mode by default
#97045 opened
Jul 16, 2025 -
[XLA:GPU] Move the s4 unpacking sequence from llvm pass to int4->int8 pass
#97047 opened
Jul 16, 2025 -
[XLA:CPU][XLA:GPU] Move concat fusion emitter to shared directory
#97050 opened
Jul 16, 2025 -
[XLA:GPU][host offloading] Implement gpu host offloading allocator.
#97051 opened
Jul 16, 2025 -
Allow the chaining of state across MetricHookInterface instantiations for multiple compilations.
#97054 opened
Jul 16, 2025 -
[XLA:GPU][host offloading] Implement host offloading thunks.
#97059 opened
Jul 16, 2025 -
[#HLODiff] Add support for manual node matching.
#97060 opened
Jul 16, 2025 -
Avoid recomputation of `pjrt_buffer->memory_space()` in `MakeMemoryKindFromPjRtBuffer`.
#97068 opened
Jul 16, 2025 -
No changes to 3rd party.
#97070 opened
Jul 16, 2025 -
Add JAX tests for deadlock verifier
#97072 opened
Jul 16, 2025 -
Added WatchJobState RPC to coordination service.
#97077 opened
Jul 16, 2025 -
Remove `local_config_nvshmem` repository and corresponding macros.
#97082 opened
Jul 17, 2025 -
Cache device on `PJRT_Buffer`.
#97083 opened
Jul 17, 2025 -
[XLA:MSA] Add block allocations for program weights that are not aliased and single use.
#97084 opened
Jul 17, 2025 -
Integrate LLVM at llvm/llvm-project@2910c24638fc
#97086 opened
Jul 17, 2025 -
[XLA:CPU] Run CSE after inlining in fusion compiler.
#97114 opened
Jul 17, 2025 -
PR #28883: [XLA:CPU][oneDNN] Add build flag to enable asynchronous support in oneDNN
#97115 opened
Jul 17, 2025 -
Introduce --dump_tflite_model_dir to dump TFLite models in Delegate Test Suite (DTS)
#97118 opened
Jul 17, 2025 -
Solve the problem number #97125
#97131 opened
Jul 17, 2025 -
Add 10 Maxtext-derived HLO-based benchmarks
#97132 opened
Jul 17, 2025 -
[XLA:GPU] Refactor tests of IndexingMap
#97138 opened
Jul 17, 2025 -
Integrate LLVM at llvm/llvm-project@06ae0c2a1086
#97139 opened
Jul 17, 2025 -
Avoid checking captured_tensors' usage when deciding if
#97141 opened
Jul 17, 2025 -
Add Metal LiteRt Tensor Buffer support
#97142 opened
Jul 17, 2025 -
[XLA][Numerics][HLO Value Tracking] Track original values through propagation of shardy annotation
#97148 opened
Jul 17, 2025 -
Enable N-dimensional sparse tensor in tpu_embedding_v3.py
#97151 opened
Jul 17, 2025 -
Remove redundant string conversion.
#97152 opened
Jul 17, 2025 -
[IFRT] Define `user_context()` in `Value` and `LoadedExecutable`
#97153 opened
Jul 17, 2025 -
Optimize `HasCombinableReplicaGroup` and `xla::CheckReplicaGroups`.
#97155 opened
Jul 18, 2025 -
[IFRT] Support XLA GPU flag overrides.
#97156 opened
Jul 18, 2025 -
Change `PjRtClient::LazyToLiteral` to take a generator that returns a future of the literal
#97162 opened
Jul 18, 2025 -
Automated Code Change
#97168 opened
Jul 18, 2025 -
Automated Code Change
#97169 opened
Jul 18, 2025 -
Automated Code Change
#97171 opened
Jul 18, 2025 -
Remove LLVM dependency from KernelThunk
#97178 opened
Jul 18, 2025 -
Automated Code Change
#97179 opened
Jul 18, 2025 -
Remove HLO and Autotuner dependency from CublasLtMatmulThunk
#97180 opened
Jul 18, 2025 -
Use `mdformat` on the XNNPack delegate readme.
#97182 opened
Jul 18, 2025 -
Removed optimized batch_matmul to redirect to XNNPACK. Also performed tiny refactoring along the way.
#97184 opened
Jul 18, 2025 -
[XLA] Use sort instead of btree in MakeFreeChunks.
#97185 opened
Jul 18, 2025 -
pass in scheduling group id when adding some new ops from ops which have id.
#97186 opened
Jul 18, 2025 -
Automated Code Change
#97188 opened
Jul 18, 2025 -
Remove AbstractCpuBuffer. All subclasses can be replaced with CommonPjRtBufferImpl and removed.
#97194 opened
Jul 18, 2025 -
Internal change only.
#97195 opened
Jul 18, 2025 -
Remove unused aliases and dependencies.
#97196 opened
Jul 18, 2025 -
Handle negative permutations in IsTransposeTrivial.
#97199 opened
Jul 18, 2025 -
Add SparseCore documentation
#97200 opened
Jul 18, 2025 -
Reverts 849435a30d0487e415126507953575358ed3c4eb
#97202 opened
Jul 18, 2025 -
Optimize BatchMatmul to Fully Connected when RHS is reshaped after dequantization.
#97205 opened
Jul 18, 2025 -
Give better error in run_hlo_module if HLO has collectives.
#97208 opened
Jul 19, 2025 -
Automated Code Change
#97209 opened
Jul 19, 2025 -
Change int64 to int64_t in Pow operations for type consistency
#97210 opened
Jul 19, 2025 -
Determine collective support based on #partitions
#97211 opened
Jul 19, 2025 -
Automated Code Change
#97212 opened
Jul 19, 2025 -
Automated Code Change
#97213 opened
Jul 19, 2025 -
Automated Code Change
#97214 opened
Jul 19, 2025 -
Automated Code Change
#97215 opened
Jul 19, 2025 -
Automated Code Change
#97216 opened
Jul 19, 2025 -
Automated Code Change
#97217 opened
Jul 19, 2025 -
Automated Code Change
#97218 opened
Jul 19, 2025 -
Automated Code Change
#97219 opened
Jul 19, 2025 -
Automated Code Change
#97220 opened
Jul 19, 2025 -
Automated Code Change
#97221 opened
Jul 19, 2025 -
Automated Code Change
#97222 opened
Jul 19, 2025 -
[xla:gpu][triton] In squeeze-dims pass, keep at least two dimensions.
#97223 opened
Jul 19, 2025 -
Automated Code Change
#97224 opened
Jul 19, 2025 -
Automated Code Change
#97225 opened
Jul 19, 2025 -
Automated Code Change
#97226 opened
Jul 19, 2025 -
Automated Code Change
#97228 opened
Jul 19, 2025 -
[XLA:GPU][Tiling] Use SmallVector<OneDimTile> to store tiling info.
#97229 opened
Jul 19, 2025 -
Avoid heap allocation for the sub buffer address
#97230 opened
Jul 19, 2025 -
Add a scratch implementation of muon
#97231 opened
Jul 19, 2025
5 Issues closed by 3 people
-
Inconsistent NotEqual broadcasting behavior between CPU and GPU (CPU fails silently, GPU raises error)
#97227 closed
Jul 19, 2025 -
graph execution error bug with tfm.nlp.layers.MultiHeadRelativeAttention
#94599 closed
Jul 18, 2025 -
tensorflow crashes when run with python -OO
#96900 closed
Jul 15, 2025 -
Unable to install TensorFlow: No matching distribution found for TensorFlow!
#79349 closed
Jul 13, 2025 -
Broken Link: Microsoft Visual C++ Redistributable (pip install)
#93826 closed
Jul 12, 2025
11 Issues opened by 5 people
-
Inconsistent behavior for `tf.raw_ops.NotEqual` between CPU and GPU with non-broadcastable shapes
#97204 opened
Jul 18, 2025 -
could you add support of the new optimizer: Muon
#97187 opened
Jul 18, 2025 -
`tf.nn.depthwise_conv2d` crashes with large `strides` values when ONEDNN is enabled
#97165 opened
Jul 18, 2025 -
`tf.pow` returns inconsistent value on CPU vs GPU
#97125 opened
Jul 17, 2025 -
`tf.nn.local_response_normalization` returns incorrect output
#97105 opened
Jul 17, 2025 -
`tf.linalg.matrix_rank` produces inconsistent output on CPU vs GPU with `tol=6`
#97102 opened
Jul 17, 2025 -
`tf.math.argmax` throws `InvalidArgumentError` with valid `axis` of `int16` dtype
#97096 opened
Jul 17, 2025 -
Tensorflow 2.19 fails to load after Pyside6
#97058 opened
Jul 16, 2025 -
`tf.experimental.numpy.cumsum` handles overflow inconsistently on CPU and GPU
#97042 opened
Jul 16, 2025 -
Core Dump When Training
#97016 opened
Jul 16, 2025 -
Will TF support Triton in the future
#96876 opened
Jul 13, 2025
53 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Stable delegate python api
#93850 commented on
Jul 16, 2025 • 1 new comment -
Update Docs to latest release
#89084 commented on
Jul 12, 2025 • 0 new comments -
`tf.nn.conv2d_transpose` crashes with "Illegal instruction (core dumped)"
#93733 commented on
Jul 18, 2025 • 0 new comments -
[Compatibility][Upgrade] TensorFlow 2.x to 2.15.0: Dependency Conflict and Version Downgrade Issue
#96694 commented on
Jul 19, 2025 • 0 new comments -
Build Error While Compiling TensorFlow Lite Using CMake
#96654 commented on
Jul 19, 2025 • 0 new comments -
Memory leak in tf.data when iterating over Dataset.from_generator
#65675 commented on
Jul 19, 2025 • 0 new comments -
tfLite. Select fastest available GPU
#88039 commented on
Jul 14, 2025 • 0 new comments -
[XLA] Add stack trace breakdown to `HloLiveRange::ToString` for peak memory usage
#94954 commented on
Jul 18, 2025 • 0 new comments -
io_utils: prevent `input()` crash in non-interactive mode
#95525 commented on
Jul 16, 2025 • 0 new comments -
Update Protobuf to 6.31.1
#95873 commented on
Jul 14, 2025 • 0 new comments -
Add metadata for CUDA and libtpu versions
#95903 commented on
Jul 15, 2025 • 0 new comments -
Remove LiteRT modules from TF python deps.
#95991 commented on
Jul 18, 2025 • 0 new comments -
Add a Reflection Map to `emitc` class
#96263 commented on
Jul 18, 2025 • 0 new comments -
[XLA] Remove dead argument in ProtoToHumanReadableJson
#96582 commented on
Jul 14, 2025 • 0 new comments -
PR #19067: [XLA:CPU][oneDNN] Move simplification pass before oneDNN pass
#96617 commented on
Jul 19, 2025 • 0 new comments -
#sdy Remove MHLO shardings from round-trip export
#96640 commented on
Jul 17, 2025 • 0 new comments -
Use RPG's solution as a hint to CP-SAT
#96674 commented on
Jul 17, 2025 • 0 new comments -
[XLA:benchmarks] Test Nvidia benchmarks from https://github.com/openxla/xla/pull/28728
#96678 commented on
Jul 15, 2025 • 0 new comments -
[XLA] Add stack trace breakdown to `HloLiveRange::ToString` for peak memory usage
#96754 commented on
Jul 13, 2025 • 0 new comments -
Add an option to do multiple executions of the same module to HloRunners.
#96807 commented on
Jul 15, 2025 • 0 new comments -
Add Hermetic C++ Toolchains for Linux x86_64 builds.
#96820 commented on
Jul 17, 2025 • 0 new comments -
Automated Code Change
#96833 commented on
Jul 14, 2025 • 0 new comments -
Automated Code Change
#96838 commented on
Jul 14, 2025 • 0 new comments -
Automated Code Change
#96839 commented on
Jul 14, 2025 • 0 new comments -
Automated Code Change
#96841 commented on
Jul 14, 2025 • 0 new comments -
Automated Code Change
#96842 commented on
Jul 14, 2025 • 0 new comments -
Automated Code Change
#96844 commented on
Jul 14, 2025 • 0 new comments -
GPU Not Detected by TensorFlow Despite Proper System Setup
#96707 commented on
Jul 14, 2025 • 0 new comments -
Multiple segmentation faults and aborted in some modules
#96209 commented on
Jul 14, 2025 • 0 new comments -
Some sorting related ops produce results inconsistent with NumPy when tensor contains NaN
#95235 commented on
Jul 14, 2025 • 0 new comments -
tf.data.experimental.prefetch_to_device has no effect inside tf.distribute.Strategy.distribute_datasets_from_function.
#94735 commented on
Jul 14, 2025 • 0 new comments -
TensorFlow Docker `tensorflow/tensorflow:latest-gpu` fails to detect GPU due to CUDA/cuDNN mismatch
#94593 commented on
Jul 14, 2025 • 0 new comments -
TensorFlow disables SwiftUI Previews
#95106 commented on
Jul 14, 2025 • 0 new comments -
xprof compilation fails with gcc 14.2
#94035 commented on
Jul 14, 2025 • 0 new comments -
Psycopg crashes using OpenSSL if tensorflow is imported beforehand
#93969 commented on
Jul 14, 2025 • 0 new comments -
lib new version not support 16kb pages in android
#96602 commented on
Jul 14, 2025 • 0 new comments -
how to use libtensorflowlite_c.so C API and delegate gpu opencl correctly?
#95795 commented on
Jul 15, 2025 • 0 new comments -
TensorFlow Java documentation is outdated
#96799 commented on
Jul 15, 2025 • 0 new comments -
Dataset.ragged_batch does not produce correct specs with tf.py_function and tf.numpy_function
#60710 commented on
Jul 15, 2025 • 0 new comments -
`tf.split` or `tf.transpose` cause errors for quantize-aware training with `quantize_apply`
#60714 commented on
Jul 15, 2025 • 0 new comments -
Incorrect gradient in divide_no_nan and reciprocal_no_nan when divide by 0
#60715 commented on
Jul 15, 2025 • 0 new comments -
failed to build branch r2.13
#60716 commented on
Jul 15, 2025 • 0 new comments -
Remove or update zh-cn translation from installation instructions
#62245 commented on
Jul 15, 2025 • 0 new comments -
Convolution: CPU memory increase with growing number of different sequence lengths
#62441 commented on
Jul 15, 2025 • 0 new comments -
tf.strings.to_number cannot convert positive integers prefixed with "+" when out_type is tf.int32 or tf.int64
#62191 commented on
Jul 15, 2025 • 0 new comments -
Arch Linux x86 RTX 3050 source compilation Error in fail: tensorflow/core:stream_executor_headers_lib cannot depend on tensorflow/core:lib
#96445 commented on
Jul 15, 2025 • 0 new comments -
Depth anything V2 Tflite outputs constants on qualcomm gpus
#93476 commented on
Jul 15, 2025 • 0 new comments -
how to get profile of per operation that delegate gpu opencl like cpu enable_op_profiling result, rather than only TfLiteGpuDelegateV2 ?
#96239 commented on
Jul 16, 2025 • 0 new comments -
crash when inference use libtensorflowlite_c.so and config threadnum >1 backend cpu
#96347 commented on
Jul 16, 2025 • 0 new comments -
Prebuilt binaries do not work with CPUs that do not have AVX instruction sets.
#19584 commented on
Jul 16, 2025 • 0 new comments -
TensorFlow DLL failed to load with newer version of TF
#91656 commented on
Jul 17, 2025 • 0 new comments -
YoloX different Model Output for Python and Android
#95489 commented on
Jul 17, 2025 • 0 new comments -
Fail to build libtensorflow_framework.so.2.20.0
#96569 commented on
Jul 18, 2025 • 0 new comments