| Age | Commit message (Collapse) | Author |
|
CHARACTER dummy arguments were treated as local variables in debug info.
This happened because our method to get the argument number was not
robust. It relied on `DeclareOp` having a direct reference to arguments
which was not the case for character arguments. This is fixed by storing
source-level argument positions in `DeclareOp`.
Fixes #112886
|
|
encoding (#164452)
The purpose of this patch is to allow converting FIR array representation to
memref when possible without hitting memref verifier issue.
The issue was that FIR arrays may be assumed size, in which case the
last dimension will not be known at runtime. Flang uses -1 to encode
this to fulfill Fortran 2023 standard requirements in 18.5.3 point 5
about CFI_desc_t.
When arrays are converted to memeref, if this `-1` reaches memeref
operations, it triggers verifier errors (even if the conversion happened
in code that guards the code to be entered at runtime if the array is
assumed-size because folders/verifiers do not take into account
reachability).
This follows-up on discussions in #163505 merge requests
|
|
In Fortran::lower::defineGlobal() don't change the linkage and don't
generate initializer for CDEFINED globals.
|
|
Check for unused entry dummy arrays with BaseBoxType that calls
genUnusedEntryPointBox() before processing array shapes.
Fixes #132648
|
|
components (#157914)
|
|
The allocator index is set from the component genre #157731 . There is
no more need of an operation to set it at a later point.
|
|
Fixes #126453
|
|
As described in
https://discourse.llvm.org/t/rfc-flang-representation-for-objects-inside-physical-storage/88026,
`[hl]fir.declare` should carry information about the layout
of COMMON/EQUIVALENCE variables within the physical storage.
This patch modifes Flang lowering to attach this information.
|
|
(#151408)
New implementation of `MayNeedCopy()` is used to consolidate
copy-in/copy-out checks.
`IsAssumedShape()` and `IsAssumedRank()` were simplified and are both
now in `Fortran::semantics` workspace.
`preparePresentUserCallActualArgument()` in lowering was modified to use
`MayNeedCopyInOut()`
Fixes https://github.com/llvm/llvm-project/issues/138471
|
|
allocation" (#152418)
Reviewed in #152379
- Move the allocator index set up after the allocate statement otherwise
the derived type descriptor is not allocated.
- Support array of derived-type with device component
|
|
allocation" (#152402)
Reverts llvm/llvm-project#152379
Buildbot failure
https://lab.llvm.org/buildbot/#/builders/207/builds/4905
|
|
(#152379)
- Move the allocator index set up after the allocate statement otherwise
the derived type descriptor is not allocated.
- Support array of derived-type with device component
|
|
(#152041)
The descriptor for derived-type with CUDA components are allocated in
managed memory. The lowering was calling the standard runtime on
allocate statement where it should be a `cuf.allocate` operation.
|
|
Some operation creations were updated in flang directory but not all.
Migrate the CUF ops to the new create APIs introduce in #147168
|
|
See https://github.com/llvm/llvm-project/pull/147168 for more info.
|
|
derived-type (#149418)
|
|
ArrayRef(std::nullopt_t) has been deprecated. This patch replaces
std::nullopt with {}.
A subsequence patch will address those places where we need to replace
std::nullopt with mlir::TypeRange{} or mlir::ValueRange{}.
|
|
- Exit early when there is no device components
- Make the retrieval of the record type more robust
|
|
Derived type initialization overwrite the component descriptor. Place
the `cuf.set_allocator_idx` after the initialization is performed.
|
|
This patch fixes:
flang/lib/Lower/ConvertVariable.cpp:787:22: error: unused variable
'fieldTy' [-Werror,-Wunused-variable]
|
|
|
|
memory (#147416)
|
|
memory (#146797)"
This reverts commit 925588cd001a91d592b99e6e7c6bee9514f5a26e.
|
|
(#146797)
Similarly to descriptor for device data, put derived type holding device
descriptor in managed memory.
|
|
Reland #145901 with a fix for shared library builds.
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It is using linkonce_odr to achieve derived type
descriptor address "uniqueness" aspect needed to match two derived type
inside the runtime.
This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.
This patch adds and experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.
Note that objects compiled with this option are not compatible with
object files compiled without because files compiled without it may drop
the rtti for type they defined if it is not used in the compilation unit
because of the linkonce_odr aspect.
I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file object anyway).
|
|
Reverts llvm/llvm-project#145901
Broke shared library builds because of the usage of
`skipExternalRttiDefinition` in Lowering.
|
|
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It is using linkonce_odr to achieve derived type
descriptor address "uniqueness" aspect needed to match two derived type
inside the runtime.
This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.
This patch adds and experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.
Note that objects compiled with this option are not compatible with
object files compiled without because files compiled without it may drop
the rtti for type they defined if it is not used in the compilation unit
because of the linkonce_odr aspect.
I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file object anyway).
|
|
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to new comers, I would like to move
away from the constructor and eventually remove it.
This patch replaces std::nullopt with {}. There are a couple of
places where std::nullopt is replaced with TypeRange() to accommodate
perfect forwarding.
|
|
Support odd case where a static object is being declared both as a
common block and a BIND(C) module variable name in different modules,
and both modules are used in the same compilation unit.
This is not standard, but happens when using MPI and MPI_F08 in the same
compilation unit, and at least both gfortran and ifx support this.
See added test case for an illustration.
|
|
`cuf.alloc` and `cuf.free` are converted to `fir.alloca` or deleted when
in device context during the CUFOpConversion pass. Do not generate them
in lowering to avoid confusion.
|
|
Fixes #108136
In #108136 (the new testcase), flang was missing the length parameter
required for the variable length string when boxing the global variable.
The code that is initializing global variables for OpenMP did not
support types with length parameters.
Instead of duplicating this initialization logic in OpenMP, I decided to
use the exact same initialization as is used in the base language
because this will already be well tested and will be updated for any new
types. The difference for OpenMP is that the global variables will be
zero initialized instead of left undefined.
Previously `Fortran::lower::createGlobalInitialization` was used to
share a smaller amount of the logic with the base language lowering. I
think this bug has demonstrated that helper was too low level to be
helpful, and it was only used in OpenMP so I have made it static inside
of ConvertVariable.cpp.
|
|
This patch defines `fir::SafeTempArrayCopyAttrInterface` and the
corresponding
OpenACC/OpenMP related attributes in FIR dialect. The actual
implementations
are just placeholders right now, and array repacking becomes a no-op
if `-fopenacc/-fopenmp` is used for the compilation.
|
|
Added options:
* -f[no-]repack-arrays
* -f[no-]stack-repack-arrays
* -frepack-arrays-contiguity=whole/innermost
|
|
TARGET dummy arrays can be accessed indirectly, so it is unsafe
to repack them.
INTENT(OUT) dummy arrays that require finalization on entry
to their subroutine must be copied-in by `fir.pack_arrays`.
In addition, based on my testing results, I think it will be useful
to document that `LOC` and `IS_CONTIGUOUS` will have different values
for the repacked arrays. I still need to decide where to document
this, so just added a note in the design doc for the time being.
|
|
Basic generation of array repacking operations in Lowering.
|
|
Use `cuf.shared_memory` operation instead of `cuf.alloc` for CUDA shared
variable. These variables do not need free operations.
|
|
Implement handling of `NULL()` RHS, polymorphic pointers, as well as
lower bounds or bounds remapping in pointer assignment inside FORALL.
These cases eventually do not require updating hlfir.region_assign,
lowering can simply prepare the new descriptor for the LHS inside the
RHS region.
Looking more closely at the polymorphic cases, there is not need to call
the runtime, fir.rebox and fir.embox do handle the dynamic type setting
correctly.
After this patch, the last remaining TODO is the allocatable assignment
inside FORALL, which like some cases here, is more likely an accidental
feature given FORALL was deprecated in F2003 at the same time than
allocatable components where added.
|
|
(#130290)
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.
Note: this relands #114002 with the fix for the LLVM timeout regressions that have been seen. The fix is to use the added fir.copy to avoid aggregate load/store.
Co-authored-by: NimishMishra <42909663+NimishMishra@users.noreply.github.com>
|
|
Fixes #121720
|
|
(#130278)
Reverts llvm/llvm-project#114002
This causes a regression building cam4_r from spec2017
|
|
|
|
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.
|
|
Similar to host flow, do not trigger automatic deallocation at then end
of the main program since anything could happen like a
cudaDevcieReset().
|
|
Dummy arguments from other entry statement that are not live in the current entry have no backing storage, user code referring to them is not allowed to be reached. The compiler was generating initialization/destruction code for them when INTENT(OUT), causing undefined behaviors.
|
|
Reverts llvm/llvm-project#123330
|
|
initializa…" (#123330)
Reverts llvm/llvm-project#123097
Reverting due to buildbot failure
https://lab.llvm.org/buildbot/#/builders/89/builds/14577.
|
|
(#123097)
…tion of global v…" (#123067)"
This reverts commit 44ba43aa2b740878d83a9d6f1d52a333c0d48c22.
Adds the flag to bbc as well.
|
|
v…" (#123067)
Reverts llvm/llvm-project#122144
Reverting due to CI failure
https://lab.llvm.org/buildbot/#/builders/89/builds/14422
|
|
(#122144)
…ariables
Patch adds a flag to control zero initialization of global variables
without default initialization. The default is to zero initialize.
|
|
Allocatable members of privatized derived types must be allocated,
with the same bounds as the original object, whenever that member
is also allocated in it, but Flang was not performing such
initialization.
The `Initialize` runtime function can't perform this task unless
its signature is changed to receive an additional parameter, the
original object, that is needed to find out which allocatable
members, with their bounds, must also be allocated in the clone.
As `Initialize` is used not only for privatization, sometimes this
other object won't even exist, so this new parameter would need
to be optional.
Because of this, it seemed better to add a new runtime function:
`InitializeClone`.
To avoid unnecessary calls, lowering inserts a call to it only for
privatized items that are derived types with allocatable members.
Fixes https://github.com/llvm/llvm-project/issues/114888
Fixes https://github.com/llvm/llvm-project/issues/114889
|