List of commits

GCC with patches for OS216


Rev. Time Author
cb0047f devel/autopar_devel 2020-07-03 00:17:32 Giuliano Belinassi

Implement new partitioning algorithm

Previously, we tried to follow strictly how add_symbol_to_partition
behaves when adding nodes to decide when to merge them. This new
partitioner takes some freedom to explore this, promoting statics
to globals, for performance.

We also comment out some assertions, which may need proper fixes in the future.

gcc/ChangeLog
2020-07-02 Giuliano Belinassi <giuliano.belinassi@usp.br>

* cgraphunit.c (ipa_passes): Call lto_merge_comdat_map.
* ipa-cp.c (initialize_node_lattices): Comment assertion for now.
* ipa-profile.c (ipa_propagate_frequency): Comment assertion for now.
* lto-partition.c (merge_comdat_nodes): New function.
(privatize_symbol_name_1): New argument wpa.
(privatize_symbol_name): Check if in WPA mode.
(lto_merge_comdat_map): New function.
* lto-partition.h: Declare lto_merge_comdat_map.

6e9a060 2020-07-01 09:16:31 Giuliano Belinassi

Add dump information to lto_max_no_alonevap_map

This commit adds dump information to the lto_max_no_alonevap_map
partitioner if dump_file is provided.

gcc/ChangeLog
2020-06-30 Giuliano Belinassi <giuliano.belinassi@usp.br>

* lto-partition.c (analyse_symbol_references): Dump information if
dump_file is provided.
(analyse_symbol_1): Same as above.
(analyse_symbol_1): Same as above.
(lto_max_no_alonevap_map): Same as above.

f17ad98 2020-06-30 12:29:21 Giuliano Belinassi

Better handle statics.

Previously, we merged on any reference to a static variable or function. Now
we are more careful: we check whether the symbol is publicly available, and
avoid merging partitions which have references to a non-static variable.

2020-06-30 Giuliano Belinassi <giuliano.belinassi@usp.br>
* lto-cgraph.c (lto_apply_partition_mask): Avoid removal of nodes
in partition.
(maybe_release_function_dominators): Move from.
* cgraph.c (maybe_release_dominators): To here.
* ipa-visibility.c (localize_node): Skip assertion if
split_outputs.
* lto-partition.c (analyse_symbol_references): Merge
partitions if reference to static varnode instead of any global.
(promote_symbol): Correctly handle statics.
(lto_max_no_alonevap_map): Remove quick returns.
(privatize_symbol_name_1): Implement hashing of static names.

e778bf0 2020-06-25 06:17:54 Giuliano Belinassi

Merge small partitions.

The current partitioner seems to create one huge partition and several
small partitions. This commit improves the partitioning
algorithm by merging small partitions together. It also avoids wasting
time partitioning if the program size is `small'. Note that small here
is subjective.

gcc/ChangeLog:
2020-06-24 Giuliano Belinassi <giuliano.belinassi@usp.br>

* cgraphunit.c (ipa_passes): Assert for non-empty partition.
* lto-partition.c: Remove current_working_partition.
(add_symbol_to_partition): Remove useless check.
(lto_max_no_alonevap_map): Use a heuristic to merge small
partitions. Also fix creation of empty partitions.
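
The commit message does not spell out the merging heuristic, so the following is only a minimal sketch of one plausible approach, not the code in lto-partition.c. The function name merge_small_partitions, the min_size threshold and the "bucket" scheme are all assumptions made for illustration: every partition below the threshold is folded into a shared bucket, and a new bucket is started once the current one is large enough.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heuristic (not GCC's): walk the partitions in order and fold
   every partition smaller than min_size into a shared "bucket" partition,
   starting a new bucket once the current one reaches min_size.

   sizes[i]  : size of partition i (for example, an instruction count)
   target[i] : output, index of the partition that i ends up in
   Returns the number of partitions left after merging.  */
size_t merge_small_partitions(const int *sizes, size_t n, int min_size,
                              size_t *target)
{
  size_t remaining = 0;
  size_t bucket = 0;      /* index of the partition acting as the bucket */
  int bucket_size = 0;
  int have_bucket = 0;

  assert(min_size > 0);
  for (size_t i = 0; i < n; i++) {
    if (sizes[i] >= min_size) {
      target[i] = i;            /* large enough: keep as is */
      remaining++;
    } else if (!have_bucket) {
      have_bucket = 1;          /* first small partition opens a bucket */
      bucket = i;
      bucket_size = sizes[i];
      target[i] = i;
      remaining++;
    } else {
      target[i] = bucket;       /* fold into the open bucket */
      bucket_size += sizes[i];
    }
    if (have_bucket && bucket_size >= min_size)
      have_bucket = 0;          /* bucket is full: the next small one opens a new bucket */
  }
  return remaining;
}
```

With sizes {100, 3, 4, 5, 2, 50} and a threshold of 10, the three small partitions at indices 1 to 3 end up together, index 4 opens a new bucket, and four partitions remain.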

3fd18ee 2020-06-24 08:48:01 Giuliano Belinassi

Run ipa passes when split_outputs

Previously, a bug prevented the IPA passes from running when split_outputs
is provided. This commit fixes that by correctly setting the guard,
and updates how the flags in the partition boundary are set accordingly.

gcc/ChangeLog
2020-06-23 Giuliano Belinassi <giuliano.belinassi@usp.br>

* cgraphunit.c (ipa_passes): Run ipa passes also when
split_outputs.
* ipa-icf.c (gate): Don't run when split_outputs.
* lto-cgraph.c (lto_apply_partition_mask): Correctly set nodes in
the partition boundary.

35bfd2e 2020-06-19 05:21:54 Giuliano Belinassi

Manage to get bootstrap to work.

Previously, bootstrap was not working for the following reasons:

* The gimplifier was emitting assembler early in the compilation.
* The driver was spawning as many processes in parallel as it
could (fork bomb), crashing the computer.

In this commit we fix these issues.

gcc/ChangeLog
2020-06-18 Giuliano Belinassi <giuliano.belinassi@usp.br>

* toplev.c (lang_dependent_init): Move call to init_asm_output to
* cgraphunit.c (compile): Here. Also run handle_additional_asm if
split_outputs is provided.
* gcc.c (append_split_outputs): Record asm temporary file.
(execute): Run a max of 4 jobs in parallel instead of n_commands.
* cgraph.h: Update finalize_compilation_unit and compile
declarations.

2020-06-18 Richard Biener <rguenther@suse.de>

* varasm.c (assemble_variable): Make sure to not
defer output when outputting addressed constants.
(output_constant_def_contents): Likewise.
(add_constant_to_table): Take and pass on whether to
defer output.
(output_addressed_constants): Likewise.
(output_constant_def): Pass on whether to defer output
to add_constant_to_table.
(tree_output_constant_def): Defer output of constants.
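
The driver change in this commit caps the number of simultaneous jobs instead of starting every command at once. The following is a minimal POSIX sketch of that idea, not the code in gcc.c: run_bounded and example_job are hypothetical names, and the job is a plain function run in a forked child rather than a real compiler command.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a real command: the child exits with status 0
   for even indices and 1 for odd ones.  */
int example_job(int i)
{
  return i % 2;
}

/* Count one reaped child if it exited normally with status 0.  */
static void note_exit(int status, int *succeeded)
{
  if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
    (*succeeded)++;
}

/* Run n jobs, keeping at most max_jobs child processes alive at once.
   Returns the number of jobs that exited with status 0, or -1 if a fork
   failed (children already started are still reaped).  */
int run_bounded(int n, int max_jobs, int (*job)(int))
{
  int running = 0, succeeded = 0, fork_failed = 0, status;

  assert(max_jobs > 0);
  for (int i = 0; i < n; i++) {
    if (running == max_jobs) {
      /* Pool is full: block until one child finishes.  */
      if (wait(&status) > 0)
        note_exit(status, &succeeded);
      running--;
    }
    pid_t pid = fork();
    if (pid < 0) {
      fork_failed = 1;
      break;
    }
    if (pid == 0)
      _exit(job(i));            /* child: run the job and exit */
    running++;
  }
  while (running > 0) {         /* drain the remaining children */
    if (wait(&status) > 0)
      note_exit(status, &succeeded);
    running--;
  }
  return fork_failed ? -1 : succeeded;
}
```

Calling run_bounded(6, 4, example_job) never has more than four children alive at once, whereas starting all n commands up front is what produced the fork bomb described above.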

affffe8 2020-06-18 09:04:45 Giuliano Belinassi

Fix undefined reference when linking to libc

When compiling with -fvisibility=hidden, libc functions were being
marked as hidden, and then ld could not link against libc. This
commit fixes that.

It also temporarily fixes undefined references to functions
that are passed as arguments and called from another function, by
looking for references inside the function.

gcc/ChangeLog
2020-06-17 Giuliano Belinassi <giuliano.belinassi@usp.br>

* cgraphunit.c (ipa_passes): Flush asm file, and handle
crash of child process.
* gcc.c (execute): Print additional ld call if verbose flag is
provided.
* ipa-visibility.c (gate): Run if split_outputs.
* lto-cgraph.c (lto_apply_partition_mask): Only mark body_removed
if node is not a definition.
* lto-partition.c (analyse_symbol_references): Merge partitions of
functions which we take address of.
(lto_promote_cross_file_statics): call promote_symbol only if
promote is provided.
* lto-partition.h: update lto_promote_cross_file_statics
declaration.
* config/i386/i386-expand.c (ix86_expand_builtin): Initialize
icode.

3ec7662 2020-06-09 12:10:07 Giuliano Belinassi

Implement a new partitioning algorithm

Implements a new partitioning algorithm based on add_node_to_partition
logic, still using a union-find data structure to merge partitions
quickly. libgcc is compiling, libstdc++ fails to link.

gcc/ChangeLog
2020-06-08 Giuliano Belinassi <giuliano.belinassi@usp.br>

* cgraphunit.c (cgraph_node::expand): Quick return if no body is
available.
(ipa_passes): Skip empty partitions.
* gcc.c (append_split_outputs): Fix misleading indentation.
* ipa.c (symbol_table::remove_unreachable_nodes): Assert for
split_outputs.
* lto-cgraph.c (lto_apply_partition_mask): set body_removed when
body is removed.
* lto-partition.c: (current_working_partition): New variable.
* (add_symbol_to_partition): Check if insertion on correct
partition. Also check if nodes from the same COMDAT group are
mapped to the partition.
(union_find::print_roots): New function.
(ds): New variable.
(ds_print_roots): New function.
(analyse_symbol_references): New function.
(analyse_symbol_1): New function.
(int_cast): Remove.
(lto_max_no_alonevap_map): Replace with new algorithm.

374ee31 2020-06-05 13:00:51 Giuliano Belinassi

Make libgcc compile

Finally, we managed to get libgcc to compile with this version.
Changes to the partitioner were necessary for this, such as
merging partitions with calls to static functions in common.

gcc/ChangeLog
2020-06-05 Giuliano Belinassi <giuliano.belinassi@usp.br>

* cgraph.h (symtab_node): New attribute aux2.
* cgraphunit.c (ipa_passes): Decide not to compile in parallel.
* gcc.c (has_hidden_E): New function.
* (append_split_outputs): Add fPIC and abort when a hidden -E is
provided.
* (execute): Do not call append_split_outputs when -E is provided.
* lto-partition.c: Merge calls to static functions to same
partition.
* (lto_check_usage_from_other_partitions): Update
used_from_other_partitions to nodes other than varpool.

gcc/testsuite/ChangeLog
2020-06-05 Giuliano Belinassi <giuliano.belinassi@usp.br>

* gcc.dg/driver/driver.exp: New test.
* gcc.dg/driver/empty.c: New file.

735b626 2020-06-05 04:20:53 Giuliano Belinassi

Implement a new partitioner.

Here we implement a new partitioner that asserts the following
property: Every function that references a varpool node gets in
the same partition as the varpool node, so that we don't need to
export variables that may conflict with other Compilation Units.

gcc/ChangeLog:
2020-06-04 Giuliano Belinassi <giuliano.belinassi@usp.br>

* lto-partition.c (class union_find): New class.
(lto_max_no_alonevap_map): New function.
(int_cast): New function.
* lto-partition.h (lto_max_no_alonevap_map): Declare.
* cgraphunit.c (ipa_passes): Call lto_max_no_alonevap_map.
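
The property described in this commit (a function lands in the same partition as every variable it references) maps naturally onto a union-find structure. The following is a minimal sketch under stated assumptions, not the class union_find added to lto-partition.c: symbols are plain integer indices, the table has a fixed hypothetical capacity MAX_SYMBOLS, and the names uf_init, uf_find, uf_unite and partition_by_references are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SYMBOLS 64  /* hypothetical capacity for this sketch */

struct union_find {
  int parent[MAX_SYMBOLS];
  int n;
};

/* Every symbol starts in its own partition.  */
void uf_init(struct union_find *uf, int n)
{
  assert(n >= 0 && n <= MAX_SYMBOLS);
  uf->n = n;
  for (int i = 0; i < n; i++)
    uf->parent[i] = i;
}

/* Return the representative of x's partition, halving paths on the way.  */
int uf_find(struct union_find *uf, int x)
{
  assert(x >= 0 && x < uf->n);
  while (uf->parent[x] != x) {
    uf->parent[x] = uf->parent[uf->parent[x]];
    x = uf->parent[x];
  }
  return x;
}

/* Merge the partitions containing a and b.  */
void uf_unite(struct union_find *uf, int a, int b)
{
  int ra = uf_find(uf, a);
  int rb = uf_find(uf, b);
  if (ra != rb)
    uf->parent[rb] = ra;
}

/* One reference from a function (by index) to a variable (by index).  */
struct reference {
  int func;
  int var;
};

/* Put each function in the same partition as every variable it references.  */
void partition_by_references(struct union_find *uf,
                             const struct reference *refs, size_t nrefs)
{
  for (size_t i = 0; i < nrefs; i++)
    uf_unite(uf, refs[i].func, refs[i].var);
}
```

Two functions that touch the same variable end up with the same representative, so the variable never has to be exported across partitions; functions with disjoint variable sets stay apart and can be compiled separately.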

7e9bf3c 2020-06-04 05:27:03 Giuliano Belinassi

Manage to compile some programs

Finally, we manage to compile some programs. Here we merge lto-partition
into the compiler, correctly update used_from_other_partition, and
release the bodies of functions that will not be used anymore.
We are still running the compilation serially for testing, but we
are forking and applying the partition mask correctly.

For now, there is a bug where symbols from distinct CUs collide when
linking.

gcc/ChangeLog
2020-06-03 Giuliano Belinassi <giuliano.belinassi@usp.br>

* cgraph.c: (release_function_body): Reinsert dom_info_available_p
check.
* cgraphunit.c (ipa_passes): Check for split_outputs, correctly
check for variable usage from other partitions, and handle the
additional assembler file correctly.
* gcc.c (get_file_by_lines): Returns file existence. Fix double
free.
(append_file_outputs): Add -fPIE when -c is provided, and quickly
returns if additional asm file not found.
(maybe_run_linker): Check if split outputs case by checking
temp_object_files length equals zero.
* ipa-visibility.c (localize_node): Reinsert assertion check.
* lto-cgraph.c (maybe_release_function_dominators): New.
(lto_apply_partition_mask): Release data not required anymore.
* lto-partition.c (lto_promote_cross_file_statics): Reinsert
assertion check.
(lto_check_usage_from_other_partitions): New.
* lto-partition.h (lto_check_usage_from_other_partitions): Declare.

gcc/lto/ChangeLog
2020-06-03 Giuliano Belinassi <giuliano.belinassi@usp.br>

* lto-partition.c: Remove.
* Make-lang.in: Remove lto-partition.o.

1b05b41 2020-05-30 11:16:35 Giuliano Belinassi

Run partitioner after IPA.

We now run the partitioner after IPA analysis. We still need to
manage the assembler file so that multiple instances of it can be opened,
and to check why some "safety" checks had to be removed.

gcc/ChangeLog
2020-05-29 Giuliano Belinassi <giuliano.belinassi@usp.br>

* Makefile.in (lto-partition.o): Add.
* cgraph.c (release_function_body): Remove dom_info_available_p
check.
* cgraphunit.c (ipa_passes): Run partitioner after
execute_ipa_summary_passes.
* ipa-visibility.c (optimize_weakref): Remove whole_program or
in_lto_p check.
* lto-cgraph.c (lto_apply_partition_mask): New.
* lto-streamer.h (ltrans_partition_def): Declare.
* (lto_apply_partition_mask): Expose symbol externally.

0b21709 2020-05-28 08:38:08 Giuliano Belinassi

Add test for driver

This commit adds some tests to the `gcc' driver, which are already
covered by bootstrap. However, here we have a quick way of testing
things.

gcc/testsuite/ChangeLog
2020-05-27 Giuliano Belinassi <giuliano.belinassi@usp.br>

* gcc.dg/driver/driver.exp: New test.
* gcc.dg/driver/a.c: New file.
* gcc.dg/driver/b.c: New file.
* gcc.dg/driver/a.c: New file.

6600738 2020-05-22 02:45:54 Giuliano Belinassi

Implement remaining cases

There were two cases which required additional attention:
* When -c and -o are provided.
* When cc1* is called to compile an .S file.

Here we fix both of these cases, and also initialize the additional-asm
file in a more reliable place.

Bootstrap is working here.

gcc/ChangeLog
2020-05-20 Giuliano Belinassi <giuliano.belinassi@usp.br>

* gcc.c (get_file_by_lines): Accept a path to file instead of FILE
object.
(identify_asm_file): Better object identification heuristic.
(append_split_outputs): Return if temporary asm file was not
created, and also implement -o case.
(fsplit_arg): Remove .s extension from temporary file.
* toplev.c (init_additional_asm_names_file): Close and flush file.
(do_compile): init_additional_asm_names_file here instead of.
(lang_dependent_init): Here.

9486636 2020-05-20 00:54:03 Giuliano Belinassi

Fix compilation with multiple -c

The previous commit had an issue when calling

gcc -c file1.c file2.c

The expected result is one .o file for each .c file, but
we were merging the object files.

gcc/ChangeLog
2020-05-18 Giuliano Belinassi <giuliano.belinassi@usp.br>

* gcc.c (append_split_outputs): Truncate temp_obj vector for the
next compilation.

ce9f7bc 2020-05-19 11:18:08 Giuliano Belinassi

Queue up additional call to ld

Add an extra ld call to handle the -c case, creating a final object
file.

gcc/ChangeLog
2020-05-18 Giuliano Belinassi <giuliano.belinassi@usp.br>

* gcc.c (EMPTY_CMD): New macro.
(temp_object_files): New variable.
(get_path_to_ld): New function.
(append_split_outputs): Queue up a call to ld.
(await_commands_to_finish): New function.
(split_commands): Same as above.
(parse_argbuf): Same as above.
(execute): Refactor based on new functions, plus call
additional ld when necessary.
(fsplit_arg): Append .s to temporary file.

25fe346 2020-05-16 00:33:47 Giuliano Belinassi

Handle `as' calls when splitting asm output.

Read the temporary additional asm file provided by cc1*, and call the
assembler for each of them.

gcc/ChangeLog
2020-05-14 Giuliano Belinassi <giuliano.belinassi@usp.br>

* gcc.c (extra_arg_storer): New function store.
(extra_args): Move to execute.
(get_file_by_lines): New function.
(identify_asm_files): New function.
(struct infile): Move up.
(current_infile): New variable.
(append_list_outputs): Handle the `as' case.
(execute): Use XNEWVEC instead of alloca.
* toplev.c (init_additional_asm_names_file): Break file list with
newline.
(lang_dependent_init): Provide correct asm filename.
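
The driver reads back the newline-separated list of assembler files that cc1* wrote. The following is a minimal sketch of such a reader, not the get_file_by_lines implementation in gcc.c: read_names and MAX_NAME_LEN are hypothetical, names are copied into a caller-supplied fixed-size table, and a line longer than MAX_NAME_LEN - 1 characters would be split across entries.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME_LEN 256  /* hypothetical limit for this sketch */

/* Read one file name per line from f into names[], skipping blank lines.
   Returns the number of names stored (at most max_names).  */
int read_names(FILE *f, char names[][MAX_NAME_LEN], int max_names)
{
  int count = 0;

  assert(f != NULL);
  while (count < max_names && fgets(names[count], MAX_NAME_LEN, f)) {
    /* Cut the line at its newline, if it has one.  */
    names[count][strcspn(names[count], "\n")] = '\0';
    if (names[count][0] != '\0')
      count++;                  /* keep only non-empty lines */
  }
  return count;
}
```

Each name returned this way would then be handed to a separate `as' invocation, which is the step this commit adds to the driver.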

dc1ede0 2020-05-16 00:15:34 Giuliano Belinassi

Revert "Handle `as' calls when splitting asm output."

This commit has a missing 'ChangeLog'

This reverts commit 223f59d53e23f58cf00f9dc3153e14e31597ff34.

223f59d 2020-05-15 10:05:31 Giuliano Belinassi

Handle `as' calls when splitting asm output.

Read the temporary additional asm file provided by cc1*, and call the
assembler for each of them.

23a55a9 2020-05-13 08:40:45 Giuliano Belinassi

Open temporary output file on cc1*

Open temporary file used for additional assembler output on the
compiler side. It also refactors the driver and prepares the ground
for adding support on the driver side.

gcc/ChangeLog
2020-05-12 Giuliano Belinassi <giuliano.belinassi@usp.br>

* gcc.c (extra_arg_storer): New variable extra_args.
(extra_args): New variable.
(is_assembler): Check if arg is an assembler.
(append_split_outputs): Refactor to check if argument is assembler.
(print_argbuf): New function.
(struct infile): New attribute.
(fsplit_arg): New function.
(current_infile): New variable.
* toplev.c (additional_asm_filenames): New variable.
(init_additional_asm_names_file): New.
(lang_dependent_init): Call init_additional_asm_names_file.

0fa420a 2020-05-08 06:20:03 Giuliano Belinassi

Do not call append_arg_storer if -S is provided

The flag -fsplit-outputs= should not be appended to cc1 if -S is
provided, as there is no easy way to do parallelism here.

gcc/ChangeLog
2020-05-07 Giuliano Belinassi <giuliano.belinassi@usp.br>

* gcc.c (execute): Don't call append_arg_storer if -S is provided.
(have_S): New variable.

d523422 2020-05-08 01:38:42 Giuliano Belinassi

Append '-fsplit-output=' to all cc1 calls

Add a new '-fsplit-outputs=' option so that the driver can pass an
extra filename to cc1*. This file will be used to record the filename
of each CU split off by the compiler.

gcc/ChangeLog
2020-05-07 Giuliano Belinassi <giuliano.belinassi@usp.br>

* common.opt (fsplit-output=): New option.
* gcc.c (execute): Call append_split_outputs.
(is_compiler): New function.
(get_number_of_args): New function.
(extra_arg_storer): New class.
(append_split_outputs): New function.
(print_command): New debug function.
(print_commands): New debug function.

5269b24 2020-05-05 19:42:22 Eric Botcazou

Silence warning in LTO mode on VxWorks

The link phase is always partial (-r) for VxWorks in kernel mode, which
means that it uses incremental LTO linking by default (-flinker-output=rel).
But in this mode the LTO plugin outputs a warning if one of the object files
involved in the link does not contain LTO bytecode, before switching to
nolto-rel mode. We do not do repeated incremental linking for VxWorks so
silence the warning.

lto-plugin/
* lto-plugin.c: Document -linker-output-auto-nolto-rel option.
(linker_output_set): Change type to bool.
(linker_output_known): Likewise.
(linker_output_auto_nolto_rel): New variable.
(all_symbols_read_handler): Take it into account.
<LDPO_REL>: Do not issue the warning if it is set.
(process_option): Process -linker-output-auto-nolto-rel.
(cleanup_handler): Remove unused variable.
(onload) <LDPT_LINKER_OUTPUT>: Adjust to above type change.
gcc/
* gcc.c (LTO_PLUGIN_SPEC): Define if not already.
(LINK_PLUGIN_SPEC): Execute LTO_PLUGIN_SPEC.
* config/vxworks.h (LTO_PLUGIN_SPEC): Define.

2badc98 2020-05-05 19:39:09 Eric Botcazou

Do not put incomplete CONSTRUCTORs into static memory

The CONSTRUCTOR_NO_CLEARING flag was invented to avoid generating a memset
for CONSTRUCTORS that lack elements, but it turns out that the gimplifier
can generate a memcpy for them instead, which is worse performance-wise,
so this prevents it from doing that for them.

* gimplify.c (gimplify_init_constructor): Do not put the constructor
into static memory if it is not complete.

0424a5e 2020-05-05 19:35:05 Richard Biener

tree-optimization/94949 - fix load eliding in SM

This fixes the case of not using the multithreaded model when
only conditionally storing to the destination. We cannot elide
the load in this case.

2020-05-05 Richard Biener <rguenther@suse.de>

PR tree-optimization/94949
* tree-ssa-loop-im.c (execute_sm): Check whether we use
the multithreaded model or always compute the stored value
before eliding a load.

* gcc.dg/torture/pr94949.c: New testcase.
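
To illustrate why the load cannot be elided, here is a hand-written sketch of store motion on a loop with a conditional store. It shows the general idea only and is not the code GCC generates; the function names are hypothetical. When the store is moved out of the loop and performed unconditionally, the temporary must start with the value already in memory, otherwise a loop that never stores would write garbage back.

```c
/* Original loop: *dest is written only when a positive element is seen.  */
void store_in_loop(int *dest, const int *v, int n)
{
  for (int i = 0; i < n; i++)
    if (v[i] > 0)
      *dest = v[i];
}

/* Same computation with the store moved after the loop.  */
void store_after_loop(int *dest, const int *v, int n)
{
  int tmp = *dest;              /* required load: the loop may never assign tmp */
  for (int i = 0; i < n; i++)
    if (v[i] > 0)
      tmp = v[i];
  *dest = tmp;                  /* single unconditional store */
}
```

With an input that has no positive element, both versions leave *dest unchanged only because tmp was loaded before the loop.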

1bd3a8a 2020-05-05 18:40:24 Alex Coplan

aarch64: eliminate redundant zero extend after bitwise negation

The attached patch eliminates a redundant zero extend from the AArch64 backend. Given the following C code:

unsigned long long foo(unsigned a)
{
  return ~a;
}

prior to this patch, AArch64 GCC at -O2 generates:

foo:
        mvn     w0, w0
        uxtw    x0, w0
        ret

but the uxtw is redundant, since the mvn clears the upper half of the x0 register. After applying this patch, GCC at -O2 gives:

foo:
        mvn     w0, w0
        ret

Testing:
Added regression test which passes after applying the change to aarch64.md.
Full bootstrap and regression on aarch64-linux with no additional failures.

* config/aarch64/aarch64.md (*one_cmpl_zero_extend): New.
* gcc.target/aarch64/mvn_zero_ext.c: New test.
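
The source-level fact behind this patch can be checked in portable C: the negation happens in 32 bits and the widening conversion zero-extends, so the upper half of the 64-bit result is always zero. The sketch below restates foo() with fixed-width types; the function names are hypothetical and the explicit cast is added only to keep the example independent of the width of int.

```c
#include <stdint.h>

/* Same computation as foo() in the commit message, with fixed-width types.  */
uint64_t not_then_widen(uint32_t a)
{
  return (uint32_t)~a;          /* 32-bit negation, then zero-extension */
}

/* Returns 1 when the upper 32 bits of the widened result are all zero.  */
int upper_half_is_zero(uint32_t a)
{
  return (not_then_widen(a) >> 32) == 0;
}
```

Since upper_half_is_zero holds for every input, a separate zero-extend instruction after the 32-bit mvn adds nothing.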

144aee7 2020-05-05 18:36:47 Jakub Jelinek

match.pd: Canonicalize (x + (x << cst)) into (x * cst2) [PR94800]

The popcount* testcases show yet another creative way to write popcount,
but rather than adjusting the popcount matcher to deal with it, I think
we should just canonicalize X + (X << C) to X * (1 + (1 << C))
and (X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2)), because for
multiplication we already have simplification rules that can handle nested
multiplication (X * CST1 * CST2), while for the shifts and adds we have
nothing like that. And the user could have written the multiplication anyway,
so if we don't emit the fastest or smallest code for the multiplication by
constant, we should improve that. At least on the testcases, the
emitted code seems reasonable according to cost, except that perhaps we could
in some cases try to improve expansion of vector multiplication by
uniform constant.

2020-05-05 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/94800
* match.pd (X + (X << C) to X * (1 + (1 << C)),
(X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2))): New
canonicalizations.

* gcc.dg/tree-ssa/pr94800.c: New test.
* gcc.dg/tree-ssa/popcount5.c: New test.
* gcc.dg/tree-ssa/popcount5l.c: New test.
* gcc.dg/tree-ssa/popcount5ll.c: New test.
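
The two canonicalizations rest on identities that hold for wrapping unsigned arithmetic. The sketch below writes both sides of each identity as plain C functions so they can be compared; the function names are hypothetical, and the code assumes 32-bit unsigned int and shift counts below 32.

```c
#include <assert.h>
#include <stdint.h>

/* X + (X << C)  */
uint32_t shift_add(uint32_t x, unsigned c)
{
  assert(c < 32);
  return x + (x << c);
}

/* X * (1 + (1 << C))  */
uint32_t shift_add_as_multiply(uint32_t x, unsigned c)
{
  assert(c < 32);
  return x * (1u + (1u << c));
}

/* (X << C1) + (X << C2)  */
uint32_t two_shifts(uint32_t x, unsigned c1, unsigned c2)
{
  assert(c1 < 32 && c2 < 32);
  return (x << c1) + (x << c2);
}

/* X * ((1 << C1) + (1 << C2))  */
uint32_t two_shifts_as_multiply(uint32_t x, unsigned c1, unsigned c2)
{
  assert(c1 < 32 && c2 < 32);
  return x * ((1u << c1) + (1u << c2));
}
```

For example, shift_add(7, 3) and shift_add_as_multiply(7, 3) both compute 7 * 9, and the pairs agree modulo 2^32 even when the intermediate values wrap.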

7f91620 2020-05-05 18:34:45 Jakub Jelinek

x86: Fix *vec_dupv4hi constraints [PR94942]

This insn and split splits into HI->V?HImode broadcast for avx2 and later,
but either the operands need to be %xmm0-%xmm15 (i.e. VEX encoded insn), or
the insn needs both AVX512BW and AVX512VL.
Now, Yv constraint is v for AVX512VL and x otherwise, so for -mavx512vl -mno-avx512bw
we ICE if we end up with a %xmm16+ register from RA.
Yw constraint is v for AVX512VL and AVX512BW and nothing otherwise, so
in this pattern we actually need xYw.

2020-05-05 Jakub Jelinek <jakub@redhat.com>

PR target/94942
* config/i386/mmx.md (*vec_dupv4hi): Use xYw constraints instead of Yv.

* gcc.target/i386/pr94942.c: New test.

6d938a5 2020-05-05 18:34:45 Jakub Jelinek

match.pd: Optimize (((type)A * B) >> prec) != 0 into __imag__ .MUL_OVERFLOW [PR94914]

On x86 (the only target with umulv4_optab) one can use mull; seto to check
for overflow instead of performing wider multiplication and performing
comparison on the high bits.

2020-05-05 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/94914
* match.pd ((((type)A * B) >> prec) != 0 to .MUL_OVERFLOW(A, B) != 0):
New simplification.

* gcc.target/i386/pr94914.c: New test.
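
The two forms related by this simplification can be written side by side in C. This is a sketch for 32-bit unsigned operands with hypothetical function names; __builtin_mul_overflow is the GCC/Clang builtin, used here as the source-level counterpart of the overflow check.

```c
#include <stdint.h>

/* Widening form: multiply in 64 bits and test the high 32 bits.  */
int overflows_wide(uint32_t a, uint32_t b)
{
  return (((uint64_t)a * b) >> 32) != 0;
}

/* Overflow-flag form: let the compiler report whether the 32-bit
   product wrapped.  */
int overflows_builtin(uint32_t a, uint32_t b)
{
  uint32_t low;
  return __builtin_mul_overflow(a, b, &low);
}
```

Both functions return the same answer for every pair of inputs, which is what allows the compiler to rewrite the first form into the second.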

59e4474 2020-05-05 18:33:55 Uros Bizjak

i386: Use int_nonimmediate_operand more

Pattern explosion and manual mode checks can be avoided by using
int_nonimmediate_operand special predicate.

While there, rewrite *x86_mov<SWI48:mode>cc_0_m1_neg_leu<SWI:mode>
to a combine pass splitter.

* config/i386/i386.md (*testqi_ext_3): Use
int_nonimmediate_operand instead of manual mode checks.
(*x86_mov<SWI48:mode>cc_0_m1_neg_leu<SWI:mode>):
Use int_nonimmediate_operand predicate. Rewrite
define_insn_and_split pattern to a combine pass splitter.